Almost-Periodic Response Determination for
Models of the Basilar Membrane 


By I. W. SANDBERG and J. B. ALLEN*
(Manuscript received April 8, 1985) 


Electrical networks consisting of linear passive elements and many nonlin- 
ear resistors are often used to model the basilar membrane. The inputs to 
these networks are typically a sum of sinusoids switched on at t = 0, and the 
resulting quantities of interest because of their interpretation as analogs of 
experimental observables are the steady-state response components of a cer- 
tain current and of certain voltages. In this paper, recently obtained mathe- 
matical results concerning the input-output representation of nonlinear sys- 
tems are used to give, for the first time, a locally convergent expansion for all 
of the steady-state quantities of interest. Also given is a good deal of infor- 
mation concerning general properties of the expansion, and this establishes 
important properties of the nonlinear network’s response. Of particular prac- 
tical interest is a term in the expansion that contains a component whose 
frequency is $(2f_1 - f_2)$ when the network's input consists of a sum of two
sinusoids, with frequencies $f_1$ and $f_2$. One of our main results is an explicit
expression for this $(2f_1 - f_2)$ component.


I. INTRODUCTION 


Electrical networks of the type shown in Fig. 1, together with 
sophisticated frequency-domain measurement techniques, play a cen- 
tral role in the modeling and analysis of the peripheral auditory 
system.’? In the figure—which shows a one-dimensional lumped- 
element transmission-line model of the basilar membrane—the induc- 
tors and capacitors are linear, the box at the upper left contains 


* Authors are employees of AT&T Bell Laboratories. 
Fig. 1—Network model.


lumped elements, and, as indicated, the resistors are nonlinear. The
voltage $e_0$ applied to the network is typically a finite sum of sinusoids
(often a sum of just two sinusoids) switched on at some finite time
that we take to be $t = 0$. The resulting quantities of interest, because
of their interpretation as analogs of experimental observables, are the
steady-state response components of the current $i_0$ and of one or more
of the voltages $s_1, \ldots, s_q$.

In models of interest today the number $q$ of nonlinear resistors is
typically taken to be between 200 and 500. The resistors are assumed 
to be current controlled, with each current-voltage relationship often 
represented by the sum of linear and cubic terms.!* 

The purpose of this paper is to use recently obtained results! 
concerning the input-output representation of nonlinear systems to 
give, for the first time, an expansion for all of the steady-state 
quantities of interest in Fig. 1. The expansion is in terms of $e_0$ and is
locally convergent. By this we mean that whenever the sum of the magnitudes
of the Fourier coefficients of $e_0$ is sufficiently small, and some reasonable
additional conditions are met, the steady-state quantities exist and
are given by the sum of the terms in the expansion, with each term
dependent on the frequencies and Fourier coefficients of $e_0$. We emphasize
that the expansion provides an exact representation of the
response; it is not merely an approximation or a formal expansion 
whose convergence has not been proved. However, in this paper we do 
not give lower bounds on the size of the region of convergence. 
Questions of this type are the subject of ongoing studies.’ 

In Section II it will become clear that the terms in the expansion 
are defined by a certain recursive process. Of particular practical 
interest at the present time is the term we call the third-order term,
which contains a component whose (radian) frequency is $(2\omega_1 - \omega_2)$
when $e_0$ consists of a sum of two sinusoids, one of frequency $\omega_1$, and
another of frequency $\omega_2$. One of our main results is an explicit expression
for this $(2\omega_1 - \omega_2)$ component, under some very reasonable
assumptions.
Fig. 2—More general network model.


II. EXISTENCE, PROPERTIES, AND EVALUATION OF THE STEADY-
STATE QUANTITIES


2.1 Formulation 


To enable attention to be more sharply focused on the concepts of 
importance to us, it is helpful to generalize our problem. Thus, we 
consider instead of Fig. 1 the network of Fig. 2, in which $\mathcal{N}$ is a linear
time-invariant network and $s_1, \ldots, s_p$ are voltages in $\mathcal{N}$, measured
with respect to the ground terminal, where $p$ is any positive integer.

Let $i$ and $e$, respectively, denote the transpose of the current and
voltage row vectors $(i_1, \ldots, i_q)$ and $(e_1, \ldots, e_q)$. Assume that $\mathcal{N}$ has
the representation


$$ i(t) = \int_0^t h_a(t - \tau)\, e_0(\tau)\, d\tau + \int_0^t h_c(t - \tau)\, e(\tau)\, d\tau + u_1(t), \qquad t \ge 0, \tag{1} $$


in which $h_a$ and $h_c$ are $q \times 1$ and $q \times q$ matrix-valued impulse response
functions and $u_1$ (which takes into account initial conditions) is a
bounded continuous function that approaches zero as $t \to \infty$.* Similarly,
let $r$ stand for the transpose of the response $(s_1, \ldots, s_p, i_0)$ and
suppose that there are $(p + 1) \times 1$ and $(p + 1) \times q$ matrix-valued
impulse response functions $h_d$ and $h_b$, respectively, for which


* We could have assumed that $u_1$ and the transient functions $u_2$ and $u_3$ to be
introduced are all zero functions. However, we wish to establish that the steady-state
responses are robust with respect to these functions in the strong sense that, under the
conditions to be described, they are independent of them.
$$ r(t) = \int_0^t h_d(t - \tau)\, e_0(\tau)\, d\tau + \int_0^t h_b(t - \tau)\, e(\tau)\, d\tau + u_2(t), \qquad t \ge 0, \tag{2} $$

where $u_2$ is also a bounded continuous function that approaches zero
as $t \to \infty$.

Each element of $h_a$, $h_b$, $h_c$, and $h_d$ is assumed to be an absolutely
integrable real-valued function on $[0, \infty)$ with possibly an impulse at
$t = 0$. We use $H_a$ to denote the Fourier transform of $h_a$, i.e.,

$$ H_a(\omega) = \int_0^{\infty} h_a(t)\, e^{-j\omega t}\, dt, \qquad -\infty < \omega < \infty. $$

Similarly, $H_b$, $H_c$, and $H_d$ stand for the Fourier transforms of $h_b$, $h_c$,
and $h_d$, respectively. Of course $H_a(\omega)$, $H_b(\omega)$, $H_c(\omega)$, and $H_d(\omega)$ are also
matrices. Notice that, from (1) and (2), each of these matrices has a
natural transfer-function interpretation. For example, from (1) we see
that the elements of $H_a$ are the voltage-to-current transfer functions
from the system input $e_0$ to the "inputs" $i$ of the nonlinear resistors,
when these resistors are replaced with short circuits.
The nonlinear resistors in Fig. 2 are assumed to be represented by

$$ e_k(t) = R_k[i_k(t)], \qquad (k = 1, \ldots, q) \tag{3} $$

with each $R_k$ an analytic function in some neighborhood of the origin
of the complex plane, such that $R_k(z)$ is real when $z$ is real, $R_k(0) = 0$,
and $dR_k(z)/dz = 0$ at $z = 0$. (In particular, the $R_k$ can be polynomials
with real coefficients.) In Fig. 1 the nonlinear resistors typically have
a relatively large linear part. These linear parts can be taken into
account in Fig. 2 in $\mathcal{N}$. Using known properties of networks with
positive elements, it is not difficult to show that the assumptions made
above concerning $h_a$, $h_b$, $h_c$, $h_d$, $u_1$, and $u_2$ are satisfied for the network
of Fig. 1 when put in the form of Fig. 2, as long as the linear part of
each resistor has positive resistance, all linear elements are passive,
the impedance of the two-terminal box is not zero at zero frequency,
and each $s_k$ in Fig. 2 is an $s_j$ in Fig. 1.


2.2 Steady-state responses: properties and evaluation 


We now assume that $e_0$ is given by

$$ e_0(t) = \sum_{k=-\infty}^{\infty} a_k e^{j\omega_k t} + u_3(t), \qquad t \ge 0, $$

in which the sum of the $|a_k|$ is finite; $j = (-1)^{1/2}$; the $\omega_k$ are real; and
$u_3$, like $u_1$ and $u_2$, is bounded, continuous, and approaches zero as $t \to
\infty$. We do not require that the $\omega_k$ are multiples of some fixed constant.
Thus, the input is assumed to be the sum, for $t \ge 0$, of a so-called
"almost-periodic" signal

$$ \sum_{k=-\infty}^{\infty} a_k e^{j\omega_k t}, \qquad -\infty < t < \infty \tag{4} $$

and a transient part $u_3$. Although all almost-periodic signals have a
generalized Fourier series of the form (4), the sum of the magnitudes
of the Fourier coefficients need not be finite. We shall use AP to
denote the subset of almost-periodic functions for which this sum is
finite.

At this point we are able to state our main result, which is: Under
the assumptions already discussed, and for $\sum_{k=-\infty}^{\infty} |a_k|$ as well as $u_1$,
$u_2$, and $u_3$ sufficiently small,*

1. There are unique bounded functions $i$, $e$, and $r$ that satisfy eqs.
(1), (2), and (3), and (regarding uniqueness) a certain very reasonable
neighborhood condition† concerning $i$,

2. There is a $(p + 1)$-vector-valued function $r_{ss}$, defined on $(-\infty, \infty)$,
with each of its $(p + 1)$ components belonging to AP, such that

$$ r(t) - r_{ss}(t) \to 0 \quad \text{as } t \to \infty $$

(i.e., the response $r$ approaches the steady state $r_{ss}$ as $t \to \infty$), and

3. $r_{ss}$ is independent of $u_1$, $u_2$, and $u_3$. It is given by

$$ r_{ss}(t) = \sum_{m=1}^{\infty} [r_{ss}(t)]_m, \qquad -\infty < t < \infty, \tag{5} $$

in which the $[r_{ss}(\cdot)]_m$ are $(p + 1)$-vector-valued functions, with
components belonging to AP, defined by

$$ [r_{ss}(t)]_1 = \sum_{k=-\infty}^{\infty} H_d(\omega_k)\, a_k e^{j\omega_k t} $$

and

$$ [r_{ss}(t)]_m = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_m=-\infty}^{\infty} \mathcal{B}_m(\omega_{k_1}, \ldots, \omega_{k_m})\, a_{k_1} \cdots a_{k_m}\, e^{j(\omega_{k_1} + \cdots + \omega_{k_m})t} \tag{6} $$

for $m \ge 2$, where the $\mathcal{B}_m(\omega_{k_1}, \ldots, \omega_{k_m})$, which depend on $H_a$, $H_b$, $H_c$,
and the derivatives of the $R_k$ at the origin, but not on the coefficients
$a_k$, are defined by the recursive relations (10), (11), and (12) in the
Appendix. The infinite sum in (5) converges uniformly in $t$.

Notice that a fundamental property of the class of network models 


* By "small" for $u_1$, $u_2$, and $u_3$ is meant small in the reasonable sense of the norm
of Ref. 10, p. 692.


† The condition is simply that the function $i$ must lie in a certain neighborhood of
the origin. See the first of the two footnotes in the Appendix.
considered is that, with excitations as indicated, each component of
any steady-state response $r_{ss}$ belongs to AP. In particular, the $r_{ss}$ are
well behaved; any $r_{ss}$ is continuous in $t$ and has a Fourier series, and
the Fourier series converges to $r_{ss}(t)$ for each $t$.

It is shown in the Appendix that the result described above follows 
from the main theorem in Ref. 10. In addition, bounds in Ref. 10, 
Section 2.4.3 show that the following can be added to 1 through 3. 

4. There are positive constants $\sigma$ and $\rho$ such that, with $([r_{ss}(t)]_m)_k$
the $k$th component of $[r_{ss}(t)]_m$,

$$ \sum_{m=(M+1)}^{\infty} \max_k \left| ([r_{ss}(t)]_m)_k \right| \le \sigma \left( \rho \sum_{k=-\infty}^{\infty} |a_k| \right)^{(M+1)}, \qquad -\infty < t < \infty, $$

for any positive integer $M$ [which provides useful information concerning
the error in discarding all terms in (5) beyond the $M$th].


2.3 The $(2\omega_1 - \omega_2)$ component of $[r_{ss}(t)]_3$


Each $[r_{ss}(t)]_m$ in (5) is of order $m$ in the sense that the effect of
multiplying all of the Fourier coefficients of $e_0$ by a constant $\lambda$ is to
cause $[r_{ss}(t)]_m$ to be replaced by $\lambda^m [r_{ss}(t)]_m$. Of particular interest in
applications is an explicit expression for $T(\omega_1, \omega_2, a_1, a_2)$, the component
at the frequency $(2\omega_1 - \omega_2)$ of the third-order term in (5), when

$$ e_0 = a_1 e^{j\omega_1 t} + a_{-1} e^{-j\omega_1 t} + a_2 e^{j\omega_2 t} + a_{-2} e^{-j\omega_2 t}, $$

where $a_{-1}$ and $a_{-2}$ are the complex conjugates of $a_1$ and $a_2$, respectively,
$0 < \omega_1 < \omega_2 < 2\omega_1$,* and $\alpha_k(l) = 0$ $(k = 1, \ldots, q)$ for $l = 2$, and where
here and in the Appendix $\alpha_k(l)$ denotes $d^l R_k(z)/dz^l \,|_{z=0}$.

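Under the condition on the $\alpha_k(2)$ indicated, the expression (12) for
the $\mathcal{B}_m$ yields

$$ \mathcal{B}_3(\omega_{k_1}, \omega_{k_2}, \omega_{k_3}) = \frac{1}{6}\, H_b(\omega_{k_1} + \omega_{k_2} + \omega_{k_3})\, \mathrm{diag}[\alpha_1(3), \ldots, \alpha_q(3)] \cdot \times[H_a(\omega_{k_1}), H_a(\omega_{k_2}), H_a(\omega_{k_3})], $$

where "diag" indicates a diagonal matrix and $\times[H_a(\omega_{k_1}), H_a(\omega_{k_2}),
H_a(\omega_{k_3})]$ denotes the $q$-vector whose $k$th element is the product
$[H_a(\omega_{k_1})]_k [H_a(\omega_{k_2})]_k [H_a(\omega_{k_3})]_k$ of $k$th elements for each $k$. Thus, using
(6) with $m = 3$, $a_0 = 0$, and $a_k = 0$ for $|k| > 2$, as well as the observation
that $(\omega_{k_1} + \omega_{k_2} + \omega_{k_3}) = (2\omega_1 - \omega_2)$ only if one of the $\omega_{k_i}$ is $-\omega_2$ and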


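* For $\omega_1$ and $\omega_2$ that meet these conditions, $(2\omega_1 - \omega_2)$ is not equal to $\omega_1$, $\omega_2$, $3\omega_1$, $3\omega_2$,
or $(2\omega_2 - \omega_1)$, which are the only other positive frequencies at which $[r_{ss}(t)]_3$ can have
components. However, higher-order terms of odd index can possess components at
$(2\omega_1 - \omega_2)$. For example, $(\omega_{k_1} + \cdots + \omega_{k_5}) = (2\omega_1 - \omega_2)$ if $\omega_{k_1} = \omega_{k_2} = \omega_1$, $\omega_{k_3} = -\omega_2$, and
$\omega_{k_4} + \omega_{k_5} = 0$.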
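the other two are equal to $\omega_1$, it easily follows that the sum of the
coefficients of $\exp[j(2\omega_1 - \omega_2)t]$ in $[r_{ss}(t)]_3$ (three ordered triples
qualify, each weighted by the factor 1/6) is

$$ \frac{1}{2}\, H_b(2\omega_1 - \omega_2)\, \mathrm{diag}[\alpha_1(3), \ldots, \alpha_q(3)] \times[H_a(\omega_1), H_a(\omega_1), H_a(-\omega_2)]\, a_1^2 a_{-2}. $$

This shows that

$$ T(\omega_1, \omega_2, a_1, a_2) = \mathrm{Re}\{ H_b(2\omega_1 - \omega_2)\, \mathrm{diag}[\alpha_1(3), \ldots, \alpha_q(3)] \cdot \times[H_a(\omega_1), H_a(\omega_1), H_a(-\omega_2)]\, a_1^2 a_{-2} \exp[j(2\omega_1 - \omega_2)t] \}, $$

where Re{ } stands for the real part of { }. Since $H_a$ and $H_b$ have a
direct interpretation in terms of the structure of the network of Fig.
2, so does $T(\omega_1, \omega_2, a_1, a_2)$.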
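
As a rough numerical illustration of the expression for $T$, the following sketch evaluates its complex amplitude for a single output component. Only the formula itself comes from the text. The function name `distortion_phasor`, the first-order low-pass transfer functions, and all numeric values are hypothetical choices made for illustration and are not taken from any cochlear model.

```python
import cmath

def distortion_phasor(Hb_row, Ha, alpha3, w1, w2, a1, a2):
    """Return C such that T(t) = Re{C * exp(j*(2*w1 - w2)*t)} for one output.

    Hb_row(w) -> list of q complex numbers (one row of H_b at frequency w)
    Ha(w)     -> list of q complex numbers (the q-vector H_a at frequency w)
    alpha3    -> list of q real numbers, the third derivatives alpha_k(3)
    """
    a_minus2 = a2.conjugate()            # a_{-2} is the complex conjugate of a_2
    hb = Hb_row(2 * w1 - w2)             # H_b is evaluated only at 2*w1 - w2
    ha_w1 = Ha(w1)                       # H_a at w1 (enters twice)
    ha_mw2 = Ha(-w2)                     # H_a at -w2 (enters once)
    total = 0j
    for k in range(len(alpha3)):
        total += hb[k] * alpha3[k] * ha_w1[k] ** 2 * ha_mw2[k]
    return total * a1 ** 2 * a_minus2

def T(t, C, w1, w2):
    """Time-domain value of the distortion component."""
    return (C * cmath.exp(1j * (2 * w1 - w2) * t)).real

# Hypothetical two-resistor example (q = 2) with toy low-pass responses.
def lowpass(w, wc):
    return 1.0 / (1.0 + 1j * w / wc)

Ha = lambda w: [lowpass(w, 2000.0), lowpass(w, 1000.0)]
Hb_row = lambda w: [lowpass(w, 1500.0), lowpass(w, 800.0)]
alpha3 = [6.0, 3.0]
w1, w2 = 1000.0, 1200.0                  # satisfies 0 < w1 < w2 < 2*w1
C = distortion_phasor(Hb_row, Ha, alpha3, w1, w2, 0.1 + 0j, 0.05j)
print(abs(C), T(0.0, C, w1, w2))
```

Scaling both input coefficients by a common factor scales `C` by the cube of that factor, which is the third-order behavior described at the start of this section.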

As a matter of convenience we have chosen to let $\mathcal{N}$ take into
account the linear parts of the nonlinear resistors. We could have
assumed instead that $\mathcal{N}$, without these linear parts, has sufficient
damping that our conditions on $h_a$, $h_b$, $h_c$, $h_d$, $u_1$, and $u_2$ are satisfied.
Under some very reasonable assumptions (see Ref. 10, comments on
p. 694 concerning H.3), our expression for $T(\omega_1, \omega_2, a_1, a_2)$ would then
explicitly exhibit its dependence on these linear parts, and this may
be of interest in some cases. It can be shown, using a result in Ref. 10,
Section 2.4.3, that the alternative expression for $T(\omega_1, \omega_2, a_1, a_2)$ that
we would have obtained is

$$ \mathrm{Re}\{ H_b(2\omega_1 - \omega_2)\, F(2\omega_1 - \omega_2)\, \mathrm{diag}[\alpha_1(3), \ldots, \alpha_q(3)] \cdot \times[E(\omega_1) H_a(\omega_1), E(\omega_1) H_a(\omega_1), E(-\omega_2) H_a(-\omega_2)] \cdot a_1^2 a_{-2} \exp[j(2\omega_1 - \omega_2)t] \}, $$

in which, with $1_q$ the identity matrix of order $q$,

$$ F(\omega) = \{ 1_q - \mathrm{diag}[\alpha_1(1), \ldots, \alpha_q(1)]\, H_c(\omega) \}^{-1} $$

and

$$ E(\omega) = \{ 1_q - H_c(\omega)\, \mathrm{diag}[\alpha_1(1), \ldots, \alpha_q(1)] \}^{-1} $$

(with both inverses existing for $-\infty < \omega < \infty$).


2.4 Discussion 


In this paper we have derived and discussed a general expansion for 
the response of a cochlear model having a nonlinear membrane. The 
nonlinearities of the model take into account the membrane’s nonlin- 
ear damping. Of particular interest is the third-order term in the 
expansion for the case described in Section 2.3, in which the input is 
a sum of sinusoids at frequencies $\omega_1$ and $\omega_2$. This term is the first term
in the expansion that gives rise to a component at the frequency
$(2\omega_1 - \omega_2)$.
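
The claim that the third-order term is the first to contain the $(2\omega_1 - \omega_2)$ frequency can be checked by brute-force enumeration. The sketch below counts, for orders 1 through 3, the ordered index tuples whose frequencies add to $2\omega_1 - \omega_2$. The integer frequencies and the helper name `count_orderings` are illustrative assumptions only.

```python
from itertools import product

def count_orderings(freqs, order, target):
    """Count ordered tuples of `order` input frequencies that sum to `target`."""
    return sum(1 for combo in product(freqs, repeat=order) if sum(combo) == target)

# Illustrative integer frequencies with 0 < w1 < w2 < 2*w1 (assumed values).
w1, w2 = 1000, 1200
freqs = [w1, -w1, w2, -w2]        # frequencies of the four input exponentials
target = 2 * w1 - w2

for m in (1, 2, 3):
    print(m, count_orderings(freqs, m, target))
# prints:
# 1 0
# 2 0
# 3 3
```

The three third-order tuples are the orderings of $(\omega_1, \omega_1, -\omega_2)$. Combined with the factor 1/6 in $\mathcal{B}_3$, they account for the factor 1/2 multiplying the coefficient of $\exp[j(2\omega_1 - \omega_2)t]$ in Section 2.3.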
The expression for the third-order term is seen to depend on two
transfer-function matrices $H_b$ and $H_a$, where $H_b$ relates the output
response vector $r$ to the voltages across the resistors in Fig. 2 under
the condition that $e_0$ is zero, and $H_a$ relates the currents through the
resistors to the input voltage $e_0$ under the condition that the resistors
are replaced by short circuits.

In the expression for $T$, the transfer function $H_b$ is evaluated only at
$(2\omega_1 - \omega_2)$. The function $H_b$ has the interpretation that it corresponds
to a filter that alters the distortion products after their generation on
the basilar membrane.

The terms $\alpha_1(3), \ldots, \alpha_q(3)$ are measures of the generator strength
of the nonlinear distortion as a function of position, in the sense that
each $\alpha_k(3)$ is proportional to the coefficient of the cubic term in the
power series expansion of the resistor function $R_k$. Cubic nonlinearities
have been used previously in basilar membrane models to model the
generation of distortion products.

The transfer function $H_a$ enters the expression for $T$ in a particularly
interesting way. Notice that any element of $T$, say the $l$th, is the real
part of

$$ \sum_{k=1}^{q} [H_b(2\omega_1 - \omega_2)]_{lk}\, \alpha_k(3)\, [H_a(\omega_1)]_k^2\, [H_a(-\omega_2)]_k\, a_1^2 a_{-2}\, e^{j(2\omega_1 - \omega_2)t}, $$

which is a linear combination of $q$ terms with, so to speak, $H_a$ appearing
three times in each term, twice for $\omega_1$ and once for $\omega_2$.
APPENDIX
Proof of the Main Steady-State Response Result; Recursive Relations for the
$\mathcal{B}_m$

Theorem 3 of Ref. 10 would be directly applicable to the network
governed by (1), (2), and (3) if $h_a(t - \tau)$, $h_b(t - \tau)$, $h_c(t - \tau)$, and
$h_d(t - \tau)$ were square matrices of the same size. Since this condition
is not met, we proceed to construct a suitable related set of system
equations.

Let $n = (q + p + 2)$, and define $K_a$, $K_b$, $K_c$, and $K_d$ to be the
convolution operators associated with $h_a$, $h_b$, $h_c$, and $h_d$, respectively.
Let $v$, $x$, $y$, and $w$ be given by $v = (e_0, u_1, u_2)^{tr}$, $x = (i, x^{[p+2]})^{tr}$, $y =
(e, y^{[p+2]})^{tr}$, and $w = (r, w^{[q+1]})^{tr}$, where "tr" denotes transpose, and
$x^{[p+2]}$, $y^{[p+2]}$, and $w^{[q+1]}$ are unspecified vector-valued functions (on $t \ge
0$) of the indicated dimensions. Notice that $v$, $x$, $y$, and $w$ are all $n$-vector
valued. Finally, let $\eta_k$ $(k = 1, \ldots, n)$ be the functions defined
by $\eta_k = R_k$ $(1 \le k \le q)$, with $\eta_k$ equal to the zero function for $(q + 1) \le
k \le n$.

Consider the equations

$$ x = Av + Cy \tag{7} $$
$$ w = Dv + By \tag{8} $$
$$ y = Nx \tag{9} $$

in which by (9) we mean $y_k(t) = \eta_k[x_k(t)]$ for each $k$ and $t$, and in
which $A$, $C$, $D$, and $B$ are given in partitioned form by

$$ A = \begin{pmatrix} K_a & I(q) & Z(q, p+1) \\ Z(p+2, 1) & Z(p+2, q) & Z(p+2, p+1) \end{pmatrix}, $$

$$ C = \begin{pmatrix} K_c & Z(q, p+2) \\ Z(p+2, q) & Z(p+2, p+2) \end{pmatrix}, $$

$$ D = \begin{pmatrix} K_d & Z(p+1, q) & I(p+1) \\ Z(q+1, 1) & Z(q+1, q) & Z(q+1, p+1) \end{pmatrix}, $$

and

$$ B = \begin{pmatrix} K_b & Z(p+1, p+2) \\ Z(q+1, q) & Z(q+1, p+2) \end{pmatrix}, $$

where, for any positive integers $q$ and $s$, $I(q)$ denotes the identity
operator on the space of $q$-vector-valued functions on $t \ge 0$, and $Z(q, s)$
is the zero operator from the space of $s$-vector-valued functions on $t
\ge 0$ into the corresponding space of functions whose values are of
dimension $q$.

We see that if (7), (8), and (9) are satisfied, then (1), (2), and (3)
are met, and that if the latter set of equations is satisfied and $x^{[p+2]}$,
$y^{[p+2]}$, and $w^{[q+1]}$ are zero functions, then (7), (8), and (9) are satisfied.
Using the fact that $A$, $B$, $C$, $D$, and $N$ meet the conditions of Theorem
3 of Ref. 10, it follows from that theorem that statements 1 and 2 of
Section 2.2 hold.* It also follows from the theorem that $r_{ss}$ is independent
of $u_1$, $u_2$, and $u_3$, and using the relation $w = (r, w^{[q+1]})^{tr}$, that
$r_{ss}(t)$ can be written in the form (5) with the components of each
$[r_{ss}(\cdot)]_m$ elements of AP, with

$$ [r_{ss}(t)]_1 = \sum_{k=-\infty}^{\infty} H_d(\omega_k)\, a_k e^{j\omega_k t} $$

and each $[r_{ss}(t)]_m$ for $m \ge 2$ specified as follows (after some straightforward
analysis involving partitioned matrices).

With $c_1, c_2, \ldots$ arbitrary $n$-vectors, and $\beta_1, \beta_2, \ldots$ arbitrary real
numbers, let $q$-vector-valued functions $Q_1, Q_2, \ldots$ and $(p + 1)$-vector-valued
functions $P_2, P_3, \ldots$ be defined by $Q_1(c_1, \beta_1) = H_a(\beta_1)(c_1)_1$,

$$ Q_m(c_1, \ldots, c_m, \beta_1, \ldots, \beta_m) = H_c(\beta_1 + \cdots + \beta_m)\, S_m $$

for $m \ge 2$, and

$$ P_m(c_1, \ldots, c_m, \beta_1, \ldots, \beta_m) = H_b(\beta_1 + \cdots + \beta_m)\, S_m $$

for $m \ge 2$, in which†

$$ S_m = \sum_{l=2}^{m} \frac{1}{l!} \sum_{k_1 + \cdots + k_l = m} \mathrm{diag}[\alpha_1(l), \ldots, \alpha_q(l)] \cdot \times[Q_{k_1}(c_1, \ldots, c_{k_1}, \beta_1, \ldots, \beta_{k_1}), \ldots, Q_{k_l}(c_{(m-k_l+1)}, \ldots, c_m, \beta_{(m-k_l+1)}, \ldots, \beta_m)], $$

$(c_1)_1$ is the first component of $c_1$,

$$ \alpha_k(l) = \left. \frac{d^l R_k(z)}{dz^l} \right|_{z=0} $$

for each $l$, "diag" indicates a diagonal matrix, and $\times$ is defined by the
condition that $\times[c_1, \ldots, c_l]$ is the $q$-vector with $k$th element $(c_1)_k \cdots
(c_l)_k$ $(1 \le k \le q)$. In terms of these $P_m$, we have


* The "neighborhood condition" of statement 1 is inherited from Ref. 10, part (iib)
of Theorem 3 via the relationship between (1), (2), and (3) and (7), (8), and (9).
† In the expression for $S_m$, $\sum_{k_1 + \cdots + k_l = m}$ denotes a sum over all positive integers $k_1,
\ldots, k_l$ that add to $m$.


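$$ [r_{ss}(t)]_m = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_m=-\infty}^{\infty} P_m(d_{k_1}, \ldots, d_{k_m}, \omega_{k_1}, \ldots, \omega_{k_m})\, e^{j(\omega_{k_1} + \cdots + \omega_{k_m})t}, $$

where $d_k = (a_k, 0, \ldots, 0)^{tr}$ for each $k$.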
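Observe that for any $m$ and $k$ with $1 \le k \le m$, each $Q_m$ and each
$P_m$ is linear in $c_k$ and independent of $(c_k)_l$ for $l \ge 2$. Thus, each
$Q_m(c_1, \ldots, c_m, \beta_1, \ldots, \beta_m)$ is equal to $Q_m(u, \ldots, u, \beta_1, \ldots, \beta_m)(c_1)_1
\cdots (c_m)_1$, where $u = (1, 0, \ldots, 0)^{tr}$, and similarly for the $P_m$. Therefore,
if $\mathcal{A}_m$ and $\mathcal{B}_m$ are defined by

$$ \mathcal{A}_1(\beta_1) = H_a(\beta_1), \tag{10} $$

$$ \mathcal{A}_m(\beta_1, \ldots, \beta_m) = H_c(\beta_1 + \cdots + \beta_m) \sum_{l=2}^{m} \frac{1}{l!} \sum_{\substack{k_1 + \cdots + k_l = m \\ k_j > 0}} \mathrm{diag}[\alpha_1(l), \ldots, \alpha_q(l)] \cdot \times[\mathcal{A}_{k_1}(\beta_1, \ldots, \beta_{k_1}), \ldots, \mathcal{A}_{k_l}(\beta_{(m-k_l+1)}, \ldots, \beta_m)] \tag{11} $$

for $m \ge 2$, and

$$ \mathcal{B}_m(\beta_1, \ldots, \beta_m) = H_b(\beta_1 + \cdots + \beta_m) \sum_{l=2}^{m} \frac{1}{l!} \sum_{\substack{k_1 + \cdots + k_l = m \\ k_j > 0}} \mathrm{diag}[\alpha_1(l), \ldots, \alpha_q(l)] \cdot \times[\mathcal{A}_{k_1}(\beta_1, \ldots, \beta_{k_1}), \ldots, \mathcal{A}_{k_l}(\beta_{(m-k_l+1)}, \ldots, \beta_m)] \tag{12} $$

for $m \ge 2$, we have

$$ [r_{ss}(t)]_m = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_m=-\infty}^{\infty} \mathcal{B}_m(\omega_{k_1}, \ldots, \omega_{k_m})\, a_{k_1} \cdots a_{k_m}\, e^{j(\omega_{k_1} + \cdots + \omega_{k_m})t}, \qquad -\infty < t < \infty, $$

for $m = 2, 3, \ldots$. This completes the Appendix.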
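
To make the recursive structure of relations (10), (11), and (12) concrete, the following sketch evaluates them directly: the first-order kernel is $H_a$, and each higher-order kernel is $H_c$ (or $H_b$ for the output kernel) applied to a sum, weighted by $1/l!$, over all ways of splitting the frequency arguments into $l \ge 2$ consecutive blocks. The data layout is an assumption made for illustration: transfer functions are passed as Python callables, the derivatives $\alpha_k(l)$ as one dictionary per resistor, and the names `script_A` and `script_B` are hypothetical.

```python
from math import factorial

def compositions(m, l):
    """Yield every tuple of l positive integers that add to m."""
    if l == 1:
        yield (m,)
        return
    for first in range(1, m - l + 2):
        for rest in compositions(m - first, l - 1):
            yield (first,) + rest

def inner_sum(betas, Ha, Hc, alpha):
    """Double sum over l and (k_1, ..., k_l) shared by (11) and (12); a q-vector."""
    m, q = len(betas), len(alpha)
    acc = [0j] * q
    for l in range(2, m + 1):
        for ks in compositions(m, l):
            prod = [1 + 0j] * q
            start = 0
            for k in ks:
                block = script_A(betas[start:start + k], Ha, Hc, alpha)
                prod = [p * b for p, b in zip(prod, block)]   # elementwise "x[...]" product
                start += k
            for i in range(q):
                # alpha[i] maps the order l to alpha_i(l); missing orders count as zero
                acc[i] += alpha[i].get(l, 0.0) * prod[i] / factorial(l)
    return acc

def mat_vec(M, v):
    """Multiply a matrix given as a list of rows by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def script_A(betas, Ha, Hc, alpha):
    """Relations (10) and (11): q-vector kernel of order len(betas)."""
    if len(betas) == 1:
        return list(Ha(betas[0]))
    return mat_vec(Hc(sum(betas)), inner_sum(betas, Ha, Hc, alpha))

def script_B(betas, Ha, Hb, Hc, alpha):
    """Relation (12): (p+1)-vector output kernel, for len(betas) >= 2."""
    return mat_vec(Hb(sum(betas)), inner_sum(betas, Ha, Hc, alpha))

# Scalar toy case (q = 1, p + 1 = 1) with frequency-independent responses.
Ha = lambda w: [2.0]
Hb = lambda w: [[3.0]]
Hc = lambda w: [[0.5]]
alpha = [{3: 6.0}]                       # alpha_1(2) = 0, alpha_1(3) = 6
B3 = script_B([1.0, 1.0, -1.2], Ha, Hb, Hc, alpha)
print(B3)
```

In this toy case the result agrees with the closed form of Section 2.3, $(1/6) H_b\, \alpha_1(3)\, H_a^3 = (1/6)(3)(6)(8) = 24$, since the second-order terms vanish when $\alpha_1(2) = 0$.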
A single-chip implementation of Linear Predictive Coding (LPC)-based
feature measurement for speech recognition, called the Feature Extracting 
Digital Signal Processor (FXDSP), has been developed by programming the 
AT&T DSP20™ programmable Digital Signal Processor (DSP) and has been 
verified by both numerical simulation and system use. For identical input, the 
recognition distance between floating point simulation and the DSP imple- 
mentation was found to be negligibly small when compared with distances for 
word matches. The feature-measurement technique is identical to that used 
in numerical simulations of LPC-based isolated- and connected-word recog- 
nition using combinations of dynamic time warping, vector quantization, and 
hidden Markov modeling. As a result, the FXDSP represents a single-chip 
common building block for real-time implementation of most speech recogni- 
tion techniques under investigation at AT&T Bell Laboratories. The FXDSP 
performs eighth-order LPC analysis on speech received from a standard 
CODEC. In every frame period (15 ms) it produces a feature vector consisting 
of the log energy, nine amplitude-normalized autocorrelation coefficients, and 
nine LPC-based test-pattern coefficients. The feature-measurement program 
requires 1023 locations of the 1024 available in on-chip program ROM, 211 of 
256 available RAM locations, and 75 percent of available real time. 


I. INTRODUCTION


Most speech recognition work at AT&T Bell Laboratories has been 
based on a standard form of feature measurement first proposed by 


* AT&T Bell Laboratories. † AT&T Bell Laboratories, now with Texas Instruments.
Itakura.’ Speech recognition features are computed from an eighth-
order Linear Predictive Coding (LPC) calculation on 45-ms analysis 
frames spaced by 15 ms. The autocorrelation method is used, and the 
speech is of telephone bandwidth (3.3 kHz) and is sampled at 6667 
samples per second. 

With this front end, numerical simulations have demonstrated 
successful word recognition algorithms based on dynamic program- 
ming for both isolated words” and connected words.? Real-time hard- 
ware that uses this front end for isolated-word recognition has been 
reported.* More recent simulations have used the same front end in 
recognizers that use vector quantization for isolated-> and connected- 
word recognition® and for recognizers using hidden Markov modeling.’ 

Comparative tests of the LPC front end with a variety of filter 
banks have found the LPC technique to provide superior performance 
for complex vocabularies over telephone bandwidths.® 

This paper describes a real-time implementation of this LPC fea- 
ture-measurement technique that is of single-chip complexity. The 
implementation uses a programmable signal processor, the AT&T Bell 
Laboratories Digital Signal Processor (DSP).? In this implementation, 
called the FXDSP (Feature Extracting Digital Signal Processor), the 
output continuously provides results of LPC analysis of whatever 
input signal is present with less than one frame (15 ms) of delay. 


1.1 Relation to previous work 


An implementation of LPC analysis using two DSP chips was 
previously described by Daugherty.?° This used an older version of the 
programmable signal processor known as DSP-1. The DSP-1 operates 
at one half the speed (5-MHz clock) and has one half the RAM (128 
20-bit words) as the DSP20™ signal processor used here, but has the 
same size program memory (1024 16-bit words). Thus, one DSP20 
signal processor is equivalent to two DSP-1 processors in speed and 
RAM, but is the same as one DSP-1 in program memory. 

A major challenge of the work reported here was to reduce the 
program size by a factor of 2 to attain single-chip implementation. A 
second challenge was to combine two separate time scales, that of the 
input (150 µs) and that of the output (15 ms), which had previously
been separated by two DSP-1 processors, into a single processor, the 
DSP20 signal processor. 

A microprocessor-based implementation of an isolated-word recog- 
nizer had partitioned the feature-measurement task between a slower 
general-purpose 16-bit microprocessor performing decision operations 
and a faster, special-purpose two-board signal processor performing 
high-speed repetitive arithmetic.* This arrangement is similar to the 


1788 TECHNICAL JOURNAL, OCTOBER 1985 


original simulation environment of a minicomputer and array proc- 
essor. 

An implementation of 10th-order LPC analysis has been developed 
for the TMS320* signal processor.’ The TMS320 signal processor 
uses a sampling rate of 8 kHz, a frame size of 30 ms, and a frame 
period of 20 ms. This combination of frame size and period results in 
a frame overlap of 33 percent, where each sample contributes to an 
average of 1-1/2 frames. The DSP implementation described here uses 
a frame overlap of 67 percent, and thus requires three frames of 
computation to be completed on each sample. However, an increase 
in recognition error rate accompanies the reduction in computation 
obtained by a reduction in frame overlap, as shown by numerical 
simulation.’ In the TMS320 signal processor implementation, the 
same circuit also performs pattern matching for connected-word 
recognition. 
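
The overlap figures quoted above follow directly from the frame size and frame period. The short sketch below reproduces the arithmetic for both implementations; the helper name `frame_stats` is an illustrative assumption.

```python
def frame_stats(frame_ms, period_ms):
    """Return (fractional overlap between successive frames, frames per sample)."""
    overlap = (frame_ms - period_ms) / frame_ms
    frames_per_sample = frame_ms / period_ms
    return overlap, frames_per_sample

# TMS320 implementation: 30-ms frames every 20 ms.
print(frame_stats(30, 20))
# DSP20 implementation described here: 45-ms frames every 15 ms.
print(frame_stats(45, 15))
```

The first call gives an overlap of about one third with 1.5 frames per sample, and the second gives an overlap of about two thirds with 3 frames per sample.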

In addition to realizations based on programmable signal processors, 
architectures for single-chip LPC feature extractors that use a custom- 
built processor have been described.”® 


1.2 Organization of paper 


In Section II, we examine the equations of LPC feature measure- 
ment. Section III describes the DSP chip and the external circuitry 
required to do the feature measurement. Section IV describes the 
architecture of the FXDSP program, and Section V presents some 
details of program implementation. In Section VI, the comparison of 
the real-time FXDSP calculation with a floating point simulation is 
described. 


II. LPC FEATURE MEASUREMENT


The requirement of LPC is to determine a unique set of predictor
coefficients, $a_k$, $k = 1, 2, \ldots, p$, that minimize the sum of squared
differences, $E_n$, between actual speech samples, $s(n)$, and approximated
speech samples, $\hat{s}(n)$. The approximated speech samples $\hat{s}(n)$
are formed from a linear combination of speech samples over a short
segment of the speech waveform. Thus, the approximate speech samples
are given by

$$ \hat{s}(n) = \sum_{k=1}^{p} a_k s(n - k), \tag{1} $$

where $p = 8$ in this analysis. The task of minimizing the prediction
error, $E_n$, is to choose $a_k$ such that
* Trademark of Texas Instruments. 
$$ E_n = \sum_m e_n^2(m) \tag{2} $$

$$ \phantom{E_n} = \sum_m [s_n(m) - \hat{s}_n(m)]^2 \tag{3} $$

$$ \phantom{E_n} = \sum_m \left[ s_n(m) - \sum_{k=1}^{p} a_k s_n(m - k) \right]^2 \tag{4} $$

is a minimum.

Techniques for calculating the linear prediction coefficients, $a_k$,
from the speech samples, s(n), are described in the literature.'* The 
method used here is a block-processing technique based on the auto- 
correlation method and Durbin’s recursion (Fig. 1). 

Speech which has been bandlimited to 100 to 3300 Hz and sampled 
at 6667 samples per second is first preemphasized with a first-order 
network: 


$$ s'(n) = s(n) - a\, s(n - 1); \qquad a = 0.95. \tag{5} $$


The preemphasized speech is then blocked into frames of 300 
samples (45 ms) which are spaced by 100 samples (15 ms). Thus, the 
$l$th frame of speech, $x_l$, is given by

$$ x_l(n) = s'(Ml + n), \quad n = 0, 1, \ldots, N - 1; \quad l = 0, 1, \ldots, L - 1, \tag{6} $$


where M = 100 and N = 300 for an input sequence length of L frames. 
As a result of this choice of M and N, each speech sample contributes 
to three consecutive analysis frames. 
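
Equations (5) and (6) can be sketched in a few lines. The code below is an illustration, not the FXDSP program: it works on floating-point lists, assumes $s(-1) = 0$ for the first sample, and returns only complete frames. The function names are hypothetical.

```python
def preemphasize(s, a=0.95):
    """Eq. (5): s'(n) = s(n) - a*s(n-1), with s(-1) taken as 0 (an assumption)."""
    return [s[n] - a * (s[n - 1] if n > 0 else 0.0) for n in range(len(s))]

def make_frames(sp, M=100, N=300):
    """Eq. (6): frame l holds s'(M*l + n) for n = 0..N-1; only full frames are kept."""
    count = (len(sp) - N) // M + 1 if len(sp) >= N else 0
    return [sp[M * l: M * l + N] for l in range(count)]

speech = [float(n % 17) for n in range(500)]      # stand-in for real speech samples
sp = preemphasize(speech)
frames = make_frames(sp)
print(len(frames), len(frames[0]))

# Sample index 250 lies in every frame l with M*l <= 250 < M*l + N.
print([l for l in range(len(frames)) if 100 * l <= 250 < 100 * l + 300])
```

With 500 input samples this yields three full frames of 300 samples, and sample 250 appears in all three, which illustrates the three-frame contribution noted above.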

Each frame is then smoothed by a Hamming window: 





$$ \tilde{x}_l(n) = w(n) \cdot x_l(n), \tag{7} $$

$$ w(n) = 0.54 - 0.46 \cos\left( \frac{2\pi n}{N - 1} \right), \qquad N = 300. \tag{8} $$

Fig. 1—Signal processing for extracting LPC features for recognition.
The resulting windowed frames of speech data are used to perform
an autocorrelation calculation, given by

$$ R_l(m) = \sum_{n=0}^{N-1-m} \tilde{x}_l(n)\, \tilde{x}_l(n + m); \qquad m = 0, 1, \ldots, 8. \tag{9} $$

The logarithm of the frame energy, $R_l(0)$, is then calculated:

$$ r_l = \log R_l(0). \tag{10} $$

The autocorrelation coefficients are gain-normalized such that
$R'_l(0) = 1$, as follows:

$$ R'_l(m) = \frac{R_l(m)}{R_l(0)}, \qquad m = 0, 1, \ldots, 8. \tag{11} $$


This normalization is required so that later computation of Durbin's
recursion uses the full integer precision of the machine. The log energy,
$r_l$, is used for end-point detection and frame energy information during
the recognition process.
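
The windowing, autocorrelation, log-energy, and gain-normalization steps of eqs. (7) through (11) can be sketched as follows. This is a floating-point illustration rather than the fixed-point DSP code. The function names are hypothetical, and the natural logarithm is used only as an assumption, since the base of the logarithm is not stated here.

```python
import math

def hamming(N):
    """Eq. (8): Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def autocorrelation(x, max_lag=8):
    """Eq. (9): R(m) = sum over n = 0..N-1-m of x(n)*x(n+m), for m = 0..max_lag."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) for m in range(max_lag + 1)]

def frame_features(frame):
    """Return (log energy, gain-normalized autocorrelation) for one frame."""
    w = hamming(len(frame))
    xw = [wi * xi for wi, xi in zip(w, frame)]   # eq. (7)
    R = autocorrelation(xw)
    log_energy = math.log(R[0])                  # eq. (10); natural log assumed
    Rn = [r / R[0] for r in R]                   # eq. (11); makes Rn[0] equal to 1
    return log_energy, Rn

frame = [math.sin(0.3 * n) for n in range(300)]  # synthetic test frame
log_energy, Rn = frame_features(frame)
print(log_energy, Rn[:3])
```

A frame of all zeros has $R_l(0) = 0$ and would need separate handling in practice; the sketch does not attempt it.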

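Durbin's recursion is then applied to calculate a set of PARCOR
coefficients, $k_i$, $i = 1, 2, \ldots, 8$, and a prediction residual from the
$R'_l(m)$ for each frame as follows (the frame index $l$ is suppressed):

$$ E^{(0)} = R'(0). \tag{12} $$

For $i = 1, 2, \ldots, 8$, do eqs. (13) through (16):

$$ k_i = \frac{R'(i) - \sum_{j=1}^{i-1} a_j^{(i-1)} R'(i - j)}{E^{(i-1)}} \tag{13} $$

$$ a_i^{(i)} = k_i \tag{14} $$

$$ a_j^{(i)} = a_j^{(i-1)} - k_i a_{i-j}^{(i-1)}; \qquad (j = 1, 2, \ldots, i - 1;\ i \ne 1) \tag{15} $$

$$ E^{(i)} = (1 - k_i^2)\, E^{(i-1)}. \tag{16} $$

Extract final residual, $E$, and LPC coefficients $a_j$:

$$ E = E^{(8)} \tag{17} $$

$$ a_j = a_j^{(8)}. \tag{18} $$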
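
Durbin's recursion, in its standard textbook form corresponding to eqs. (12) through (18), can be sketched in floating point as shown below. The function name `durbin` and the list-based storage are illustrative assumptions, and the fixed-point scaling used on the DSP is not modeled.

```python
def durbin(R, order=8):
    """Return (LPC coefficients a_1..a_order, PARCOR k_1..k_order, final residual E).

    R is the (normalized) autocorrelation sequence R'(0), ..., R'(order).
    """
    E = R[0]                                     # eq. (12)
    a = [0.0] * (order + 1)                      # a[j] holds a_j; a[0] is unused
    k = [0.0] * (order + 1)
    for i in range(1, order + 1):
        acc = R[i] - sum(a[j] * R[i - j] for j in range(1, i))
        k[i] = acc / E                           # eq. (13)
        new_a = a[:]
        new_a[i] = k[i]                          # eq. (14)
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]    # eq. (15)
        a = new_a
        E = (1.0 - k[i] ** 2) * E                # eq. (16)
    return a[1:], k[1:], E                       # eqs. (17) and (18)

# Check on a first-order autoregressive autocorrelation, R(m) = 0.5**m.
R = [0.5 ** m for m in range(9)]
a, k, E = durbin(R)
print(a, k, E)
```

For this input the only nonzero coefficient is $a_1 = k_1 = 0.5$, and the residual settles at $E = 0.75$ after the first step, as expected for a first-order process.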

Test-pattern coefficients are then formed by computing: 
Vit) = =O), m=0,1,---, 8. (19) 


The FXDSP output consists of $r_l$, $R'_l(m)$, and $V_l(m)$ for $m = 0, 1,
\ldots, 8$. The PARCOR coefficients $k_i$ and LPC coefficients $a_j$ are
calculated as a result of calculating $E$; however, since they are not
used directly in real-time pattern matching, they are discarded. Reference
templates are made up of autocorrelations of $a_j$ that are produced
during a non-real-time vocabulary training session. In the
current robust training algorithm, a reference pattern is made up of
an autocorrelation average of two tokens that correspond to two
different repetitions of a word.’° Therefore, no use can be made of the
LPC coefficients in real time.


III. HARDWARE


The hardware for this implementation consists of a µ-law CODEC
with filters, which is run at a 6.667-kHz sampling rate, and the AT&T
Bell Laboratories DSP, which is run at 10 MHz (Fig. 2). Separate
oscillators control the sampling rate of the CODEC and the clock of
the DSP.

A design alternative would have been to replace the 8-bit µ-law
CODEC with a 12- or 13-bit linear analog-to-digital converter. Although
a slight amount of quantization error is introduced by the µ-law
conversion of the CODEC followed by the conversion back to 13-bit
linear representation in the DSP, this error was seen to be minor.
The benefit of the economy of a simple hardware interface between
the DSP and the CODEC, the lower cost of the CODEC as compared
with a 13-bit linear converter, and the fact that any telephone line
input to the CODEC has probably already been subjected to conversions
from analog to µ-law digital and back justified the slight degradation
of waveform.

A block diagram of the DSP is shown in Fig. 3. The version used 
here, known as the DSP20 signal processor, is an improved version of 
the original signal processor described in Ref. 9 in which both speed 
and RAM size have been doubled. 

Fig. 2—LPC feature measurement hardware.

Fig. 3—Block diagram of DSP.

The DSP20 signal processor has a 400-ns instruction cycle time.
The processor consists of a read/write memory of 256 20-bit words
and a mask-programmable program ROM of 1024 16-bit words. Alternatively,
the DSP can be run from 1024 words of external program
memory, usually made of erasable programmable ROM, or RAM that
can be downloaded. An address arithmetic unit contains registers for
controlling memory access. A data arithmetic unit contains a 16-bit ×
20-bit multiplier, a 40-bit accumulator, a 40-bit adder, and a 20-bit
rounding-overflow circuit. Input and output occur through two serial
data pins.

In one 400-ns machine cycle, the DSP can (1) decode an instruction, 
(2) fetch data and perform a multiplication, (3) accumulate output 
products from the multiplier, and (4) store data in memory. 


IV. PROGRAM ARCHITECTURE 


A conflict arises between the input time scale of the FXDSP, one
sample every 150 μs, and its output time scale, 19 coefficients of a
feature vector every 15 ms. The FXDSP is required to process a new
sample every 150 μs regardless of any other operation in progress;
otherwise the input sample is lost and the resulting frame feature
vector is incorrect. Thus, two time scales exist, a sample time scale
and a frame time scale.
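The two time scales translate into an instruction budget. The sketch below works it out from figures quoted in the text (150-μs sample period, 400-ns instruction cycle, 100-sample frame period); the constant and function names are hypothetical, and integer nanoseconds are used to avoid rounding.

```python
SAMPLE_PERIOD_NS = 150_000   # one input sample every 150 microseconds
CYCLE_TIME_NS = 400          # DSP20 instruction cycle time
SAMPLES_PER_FRAME = 100      # frame period expressed in samples


def cycles_per_sample():
    """Instruction cycles available between two consecutive samples."""
    return SAMPLE_PERIOD_NS // CYCLE_TIME_NS


def frame_period_ms():
    """Frame period in milliseconds."""
    return SAMPLES_PER_FRAME * SAMPLE_PERIOD_NS // 1_000_000


def cycles_per_frame():
    """Instruction cycles available in one frame period."""
    return SAMPLES_PER_FRAME * cycles_per_sample()
```

Under these figures, any work that cannot be finished in the cycles between two samples has to be split into pieces, which is the motivation for the interleaved architecture described next.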

As a result of the two time scales, the program architecture really 
consists of two separate programs, a sample update program that 
updates autocorrelation vectors every four samples [eqs. (5) through 
(9)] and a frame-recursion program that calculates the output feature 
vector from the autocorrelation vectors from the previous frame [eqs. 
(10) through (19)]. The frame-recursion program is divided into 
smaller pieces that are interposed with repeated executions of the 
sample update program (Fig. 4). 

Fig. 4—Interleaving of sample update and frame-inversion programs.

The sample update program operates on four samples each time it
is executed. This four-sample operation is a compromise between fully
block processing, in which autocorrelation vectors are calculated on a
frame of 300 samples all at once, and fully stream processing, in which
the autocorrelation vectors are updated upon receipt of each new
sample.’° Fully block processing, however, requires enough read/write
memory to store all 300 samples, which is more memory than the
single-chip DSP has available. Fully stream processing, which has
been used in other implementations,*"! executes too slowly for real-
time analysis on the DSP.” This is because before any autocorrelation
update occurs, address pointers must be set up for accessing samples
and autocorrelation vectors, and each autocorrelation coefficient must
be accessed and placed in the accumulator of the arithmetic unit.
These overhead operations are necessary for any number of samples
used in the update, and can only be tolerated in real time if the updates
occur for more than one sample at a time.
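The compromise between block and stream processing can be illustrated with a small sketch. It assumes a plain (unwindowed) autocorrelation over lags 0 through 8 and ignores the three overlapped frames; the names `update_autocorr` and `direct_autocorr` are hypothetical. The point is only that accumulating in four-sample blocks gives the same result as processing the whole frame at once, while needing just a short history buffer.

```python
from collections import deque

MAX_LAG = 8      # lags 0..8, i.e. nine autocorrelation values
BLOCK_SIZE = 4   # samples consumed per update call


def update_autocorr(acc, history, block):
    """Add the contribution of a block of new samples to acc[0..MAX_LAG].

    history keeps only the most recent MAX_LAG + 1 samples.
    """
    for x in block:
        history.append(x)                  # newest sample sits at the right
        newest = len(history) - 1
        for lag in range(MAX_LAG + 1):
            if newest - lag >= 0:          # skip lags that reach before the start
                acc[lag] += history[newest] * history[newest - lag]


def direct_autocorr(samples):
    """Reference: autocorrelation computed over the whole frame at once."""
    return [sum(samples[n] * samples[n - lag] for n in range(lag, len(samples)))
            for lag in range(MAX_LAG + 1)]


def blockwise_autocorr(samples):
    """Run update_autocorr over consecutive four-sample blocks."""
    acc = [0] * (MAX_LAG + 1)
    history = deque(maxlen=MAX_LAG + 1)
    for start in range(0, len(samples), BLOCK_SIZE):
        update_autocorr(acc, history, samples[start:start + BLOCK_SIZE])
    return acc
```

The per-call setup (here, entering the function and locating the buffers) is paid once per four samples rather than once per sample, which mirrors the overhead argument made above.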

The frame period of 100 samples and the updating of autocorrelation
vectors by four samples at a time require that the update program be
executed 25 times per frame period. Therefore, an output operation of
one frame coefficient is added to the sample update program to provide
25 output coefficients per frame, spaced at four-sample intervals. The
19 frame coefficients ($r_l$, $R'_l(m)$, and $V_l(m)$, $m = 0, 1, \ldots, 8$)
and six consecutive zeroes are output for each frame. The sequence of six
zeroes provides a synchronization marker for identification of the 19
coefficients by the processor that receives the output of the FXDSP.
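The output framing described above can be sketched as follows. This is an illustrative receiver, not the one used with the FXDSP: the names `pack_frame`, `find_frame_start`, and `unpack_stream` are hypothetical, and the sketch assumes that the first coefficient of a frame is nonzero so that the end of the zero run is unambiguous.

```python
NUM_COEFFS = 19                                 # feature coefficients per frame
NUM_SYNC_ZEROS = 6                              # zeros used as the marker
SLOTS_PER_FRAME = NUM_COEFFS + NUM_SYNC_ZEROS   # 25 outputs per frame


def pack_frame(coeffs):
    """Return the 25 output values for one frame: 19 coefficients, 6 zeros."""
    if len(coeffs) != NUM_COEFFS:
        raise ValueError("expected 19 coefficients")
    return list(coeffs) + [0] * NUM_SYNC_ZEROS


def find_frame_start(stream):
    """Index of the first nonzero value following a run of six or more zeros."""
    run = 0
    for i, value in enumerate(stream):
        if value == 0:
            run += 1
        else:
            if run >= NUM_SYNC_ZEROS:
                return i
            run = 0
    return None


def unpack_stream(stream):
    """Recover complete 19-coefficient frames once the marker is found."""
    start = find_frame_start(stream)
    frames = []
    while start is not None and start + NUM_COEFFS <= len(stream):
        frames.append(list(stream[start:start + NUM_COEFFS]))
        start += SLOTS_PER_FRAME
    return frames
```

A receiver that joins the stream in the middle of a frame discards values until the first marker, then reads one frame every 25 outputs.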

Figure 5 shows a more detailed view of the timing of operations.
The frame recursion is divided into 25 pieces numbered LPC(0)
through LPC(24). Between the first and second samples of the group
of four sample inputs, one piece of the frame recursion program is
executed. Each piece is completed within the sample period of 150 μs.
In Table I, the function of each piece of the frame recursion is shown,
as well as the time required for execution and the number of 16-bit
words in program ROM required for that piece. As shown at the
bottom of Table I, the final 10 time slots, LPC(15) through LPC(24),
are unused.

Fig. 5—Timing of input, output, and program sections.

During the time following the second, third, and fourth samples, the
sample update operation is performed. The operations associated with
the sample update operation are described in Table II. A sample is
available every 150 μs and is placed in the FXDSP input buffer by the
CODEC. The update program reads that sample at a convenient time,
but before the next sample, arriving 150 μs later, overwrites it. Each
sample is immediately converted from μ-law to linear encoding by the
FXDSP and is then written into a four-sample buffer without any
further processing until all four samples are obtained.


Table I—Frame recursion timing and program memory (by function)

                                                          Execution   Program
Label          Function                                   Time (μs)   Locations

LPC(0)         Read $R_l(m)$ to frame recursion input     95          97
               buffer, shift window
LPC(1)         Calculate $r_l$                            60          143
LPC(2)         Calculate $R'_l(m)$; m = 1, 2, 3, 4        144¹        50
LPC(3)         Calculate $R'_l(m)$; m = 5, 6, 7, 8        144         8²
LPC(4)         Set up LPC recursion [$E_0 = R'_l(0)$]     50          32
LPC(5)         Calculate $1/E_{i-1}$ and Durbin’s         128         226
               recursion (i = 1)
LPC(6)         Calculate $1/E_{i-1}$ and Durbin’s         128         6²
               recursion (i = 2)
LPC(7)         Calculate $1/E_{i-1}$ and Durbin’s         128         6
               recursion (i = 3)
LPC(8)         Calculate $1/E_{i-1}$ and Durbin’s         128         6
               recursion (i = 4)
LPC(9)         Calculate $1/E_{i-1}$ and Durbin’s         128         6
               recursion (i = 5)
LPC(10)        Calculate $1/E_{i-1}$ and Durbin’s         128         6
               recursion (i = 6)
LPC(11)        Calculate $1/E_{i-1}$ and Durbin’s         128         6
               recursion (i = 7)
LPC(12)        Calculate $1/E_{i-1}$ and Durbin’s         128         6
               recursion (i = 8)
LPC(13)        Calculate 1/E                              128         6
LPC(14)        Calculate $V_l(m)$, m = 0, 1, ..., 8       12          41
LPC(15) thru   Idle                                       12 each     26
LPC(24)
Total (% used of available)                               1777 (12%)  688³ (67%)

¹ For signal 51 dB down from peak; shorter execution time for stronger signals.
² Locations include only the subroutine call; subroutine previously counted.
³ Total includes 17 locations of the power-up initialization routine not listed above.

Table II—Sample update timing and program memory (by function)

                                                          Execution   Program
Label              Function                               Time (μs)   Locations

Read #1            Read and μ-to-linear convert sample    3           11
Output             Output one frame feature coefficient   6           29
Read #2            Read and μ-to-linear convert sample
Move and pre-emp   Shift sample buffer by four samples    60          54
                   and preemphasize four samples
Window             Calculate window values and apply      111         123
                   three times to four samples
Read #3            Read and μ-to-linear convert sample
Autocorrelation    Use four samples to update nine        193         118
                   autocorrelation vectors for three
                   overlapped frames
Read #4            Read and μ-to-linear convert sample
Total (% used of available)                               373 (62%)   335 (33%)


As a result, the sample update program has a pipeline delay of four 
samples. The frame recursion program calculates on the frame just 
completed and produces the output of a feature vector within one 
frame period after the end of the corresponding frame. 


V. PROGRAM IMPLEMENTATION 


This section describes several novel programming techniques that 
were required to implement the FXDSP. The most scarce resource 
was program memory; execution time and read/write memory were 
available in sufficient quantities. Therefore, most innovations were 
directed toward reducing the amount of program memory required at 
the expense of increasing execution time or read/write memory re- 
quirements. The specifics of program module size, execution time, and 
execution sequence are covered in Tables I and II. 

One major problem, the negotiation between the input sample time
scale of 150 μs and the output frame time of 15 ms, was solved by the
program architecture discussed in the previous section.

A second problem was the Hamming window computation. Because
of the frame size and overlap, each sample falls into the first third of
one analysis frame, the second third of the previous analysis frame,
and the final third of the twice previous frame. Additionally, after
every 100 samples, when one of the three frames is completed, the
relationship of the three analysis windows rotates cyclically. As a
result, the Hamming window presented two problems: producing
the cosine-based values and rearranging the segments of the window
upon completing a frame.

In earlier implementations,*”° the Hamming window was stored as 
a table in program memory. In this implementation, program memory 
was too scarce, so a Taylor series expansion was used instead. Each 
third of the Hamming window (100 samples) was computed from a 
third-order Taylor series expansion about its midpoint (samples 50,
150, and 250). A comparison of the exact and approximate Hamming
windows is shown in Fig. 6 in both the time and frequency domains.
The approximated window has been shifted up slightly to ease
comparison; its peak value is actually identical to the peak of the
exact window.
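The piecewise Taylor approximation can be sketched as follows. The sketch assumes the common Hamming definition $w(n) = 0.54 - 0.46\cos(2\pi n/(N-1))$ with $N = 300$; the paper's own window equation is not reproduced in this section, so the exact constants are an assumption, and the function names are hypothetical.

```python
import math

N = 300                              # analysis frame length in samples
OMEGA = 2.0 * math.pi / (N - 1)      # assumed Hamming window frequency
MIDPOINTS = (50, 150, 250)           # expansion point for each third


def hamming(n):
    """Exact Hamming window value at sample n (assumed definition)."""
    return 0.54 - 0.46 * math.cos(OMEGA * n)


def taylor_coeffs(n0):
    """Coefficients of the third-order Taylor expansion about n0."""
    s, c = math.sin(OMEGA * n0), math.cos(OMEGA * n0)
    return (0.54 - 0.46 * c,               # w(n0)
            0.46 * OMEGA * s,              # w'(n0)
            0.46 * OMEGA ** 2 * c / 2.0,   # w''(n0) / 2!
            -0.46 * OMEGA ** 3 * s / 6.0)  # w'''(n0) / 3!


def hamming_taylor(n):
    """Approximate window value using the expansion for n's third."""
    n0 = MIDPOINTS[n // 100]
    h = n - n0
    c0, c1, c2, c3 = taylor_coeffs(n0)
    return c0 + h * (c1 + h * (c2 + h * c3))   # Horner evaluation


def max_abs_error():
    """Largest deviation between exact and approximate windows."""
    return max(abs(hamming(n) - hamming_taylor(n)) for n in range(N))
```

Only four coefficients per third need to be held, in place of a 300-entry table, at the cost of a few multiplications per sample. The error is largest at the edges of each third, where the distance from the expansion point is greatest.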

Fig. 6—Comparison of exact and Taylor series approximation of Hamming window.

To conserve program memory, several pieces of program modules
were shared for multiple functions, sometimes with multiple exit
points. For example, to perform the division required by eq. (13), the
reciprocal of the energy E was calculated. An efficient reciprocal
routine developed by Daugherty’® was used, but it required that the
number whose reciprocal was being formed lie between 1 and 2. To
build a general-purpose reciprocal routine, the number was first
normalized to fall within the desired range, and the reciprocal was
then readjusted to its true value to compensate for the normalization.
The reciprocal normalization is the same operation as the amplitude
normalization and log energy calculation performed at LPC(1)
[eq. (10)], and the same program performs both functions. However,
for amplitude normalization, the program is exited before the
reciprocal calculation is executed. Thus, the reciprocal routine, which
requires 181 program locations, shares 99 of these locations with the
gain normalization program, saving on overall program space.
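The normalize, invert, and readjust sequence can be sketched as below. The core reciprocal step here is a Newton iteration, chosen only for illustration; it is not claimed to be the routine attributed to Daugherty, whose details are not given in this section. The function name `reciprocal` and the starting guess are assumptions.

```python
def reciprocal(x, iterations=4):
    """Return 1/x for positive x using normalization to the range [1, 2).

    Illustrative sketch: the core step is a Newton iteration, not
    necessarily the routine used in the FXDSP.
    """
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Normalize: find m in [1, 2) and shift such that x = m * 2**shift.
    m = float(x)
    shift = 0
    while m >= 2.0:
        m *= 0.5
        shift += 1
    while m < 1.0:
        m *= 2.0
        shift -= 1
    # Linear starting guess, exact at m = 1 and as m approaches 2.
    y = 1.5 - 0.5 * m
    # Newton step for 1/m: the relative error is squared on every pass.
    for _ in range(iterations):
        y = y * (2.0 - m * y)
    # Readjust to undo the normalization: 1/x = (1/m) * 2**(-shift).
    return y * 2.0 ** (-shift)
```

The normalization loop is the part that a gain-normalization routine could share, since both need the same shift count; an early exit before the Newton step would correspond to the multiple-exit-point arrangement described above.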

An important way to conserve program space was the development 
of a means of computing Durbin’s recursion with a common piece of 
code for all orders, i = 1, 2, ..., 8. Although the recursion is readily
executed as a subroutine in microprocessor and Fortran implementa- 
tions, it is difficult to perform as a subroutine in a programmable 
signal processor. This is because a DSP does not allow enough address- 
ing capability to handle the two-dimensional array of a and the one- 
dimensional arrays of k, E, and R. A DSP typically provides only 
indirect addressing with the ability to increment one of two or three 
pointer registers by a fixed amount. An implementation of Durbin’s 
recursion, if strung out, requires 536 program locations (not including 
the reciprocal calculation). With the iteration-independent form used 
here, that figure drops to 119 program locations. 

By careful assignment of memory locations and proper sequencing 
through the arrays a, k, E, and R, all address calculation was made
sequential within one iteration, that is, only in increments or
decrements of one location.’” This type of address sequencing is within 
the capability of the signal processor, and makes possible the single 
subroutine for all iterations. This allowed the frame-recursion program 
and the sample update program to fit together in the 1024 locations 
of program memory. 
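For reference, Durbin's recursion written with one loop body that serves every order i can be sketched as follows. This is the textbook form in a high-level language, where general indexing is free; it does not reproduce the sequential-address memory layout described above. The sign convention (predictor polynomial with leading coefficient 1) is an assumption, since eqs. (12) through (16) are not reproduced in this section.

```python
def durbin(R, order=8):
    """Levinson-Durbin recursion on autocorrelation values R[0..order].

    Returns (a, reflection_coeffs, residual_energy) with a[0] = 1.
    """
    a = [1.0] + [0.0] * order
    E = float(R[0])                     # zeroth-order residual energy
    reflection = []
    for i in range(1, order + 1):       # same code for every order i
        acc = R[i] + sum(a[j] * R[i - j] for j in range(1, i))
        k = -acc / E                    # the division uses 1/E
        previous = a[:]
        for j in range(1, i):
            a[j] = previous[j] + k * previous[i - j]
        a[i] = k
        E *= 1.0 - k * k                # updated residual energy
        reflection.append(k)
    return a, reflection, E
```

As a check, the autocorrelation sequence R(m) = 0.5^m of a first-order process should give a single nonzero predictor coefficient and a residual energy of 0.75.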

The LPC test coefficients $V_l(m)$ produced by the recursion are
scaled by a power of 2 before output to obtain $\hat{V}_l(m)$:

    $\hat{V}_l(m) = 2^{-n(m)} \cdot V_l(m)$,    (20)

where

    $n(m) = 0$;  $m = 0, 1, 2, 3$    (21)
    $n(m) = 1$;  $m = 4, 5$          (22)
    $n(m) = 2$;  $m = 6$             (23)
    $n(m) = 3$;  $m = 7$             (24)
    $n(m) = 4$;  $m = 8$.            (25)

This scaling compensates for a scaling performed on reference
coefficients by a factor of $2^{n(m)}$ to allow each reference coefficient
to be represented in 12 bits of memory. The values $n(m)$ are based on
statistical analysis of the dynamic range of reference coefficients.”




To conclude the examination of program implementation, it is 
important to examine the arithmetic precision used in the signal 
processing. The μ-law speech is immediately converted to 13-bit linear
encoding and is multiplied by 32 to attain an 18-bit word length. All 
sample update processing before autocorrelation—that is, eqs. (5) 
through (8)—is performed with 16 bits of precision, with the only 
approximation being introduced by the Taylor series expansion of the 
Hamming window. The autocorrelation calculation, eq. (9), is per- 
formed with 34 bits of precision, which represents full accuracy for 
the 13-bit speech samples. Double-precision storage is used on the 34- 
bit autocorrelation vectors. 

A completed frame of autocorrelation vectors is normalized and then 
truncated to 15 significant bits [eqs. (10) and (11)]. This allows the 
remaining LPC recursion to be computed on single-precision data. 
Fifteen-bit precision has been shown to be adequate for fixed-point 
implementation of Durbin’s recursion.’® The LPC recursion [eqs. (12) 
through (16)], including the reciprocal calculation, is computed to at 
least 16 bits of precision. Often, for computations such as the accu- 
mulation of sums, eq. (13), the full 40-bit accumulator is used before 
rounding the sum to the single-word size. 

As a result of maintaining full precision throughout the calculation, 
the difference between the LPC calculation, as computed by the 
FXDSP and as computed by full-precision floating point simulation, 
is minimal, as will now be described. 


VI. COMPARISON WITH FLOATING POINT SIMULATION 


To evaluate the performance of the FXDSP, LPC feature measurement
as calculated by the real-time FXDSP hardware was compared with
LPC feature measurement as calculated by a floating point Fortran
simulation not running in real time. The input to both routines was a
common file of digitized speech, and the final comparison was made
using the log likelihood spectral distance used in speech recognition.
This allowed the errors introduced by the FXDSP to be compared
with typical speech recognition scores.

The two-path program flow is shown in Fig. 7. Input at the left is a
linear-encoded, 16-bit-per-sample speech file that had been band-
limited to 3.2 kHz and sampled at 6667 samples/s. The program
module FORMAT produced two speech files, one in a format suitable
for downloading into a DSPMATE, a hardware development tool
for the AT&T Bell Laboratories DSP, and the other a standard
integer speech file for Fortran simulation. Because the FXDSP is
intended for use with a μ-law CODEC, one step in the DSPMATE
formatting is the conversion of the speech file from linear to μ-law
encoding. The FXDSP immediately converts each sample back from
μ-law to linear.

Fig. 7—Program module architecture for FXDSP verification.

The upper path represents the route through the DSPMATE. The
downloadable file is sent to the DSPMATE, where it is presented as
file input in real time to a DSP chip running the LPC
feature-measurement program. The resulting outputs, consisting of
log energy $\tilde{r}_l$, gain-normalized autocorrelations $\tilde{R}'_l(m)$,
and LPC test-pattern coefficients $\tilde{V}_l(m)$, $m = 0, 1, \ldots, 8$,
are then uploaded, reformatted for Fortran simulation (FORMAT2), and
input to a log likelihood distance computation program (DIST). The
tilde over a quantity indicates that it was calculated by the FXDSP.

The lower route from FORMAT passes through a floating point
computation (FXFLOAT) that produces the values of $r_l$, $R'_l(m)$, and
$V_l(m)$ in a file whose format is identical to that produced by
FORMAT2.

Program DIST computes the log likelihood distance of Itakura’ for
test coefficients produced by the FXDSP and reference coefficients
produced by the floating point simulation. Reference coefficients
$F_l(m)$ are produced from the LPC coefficients of eq. (18) as follows:

    $F_l(0) = \sum_{j=0}^{8} a_j^2$    (26)

    $F_l(m) = 2 \cdot 2^{n(m)} \cdot \sum_{j=0}^{8-m} a_j a_{j+m}$;  $m = 1, 2, \ldots, 8$.    (27)

The values of $n(m)$ are given in eqs. (21) through (25).
The distance calculated by DIST is for test and reference frames
taken from the same sequence of speech samples and is given by

    $d_l = \log \sum_{i=0}^{8} F_l(i) \cdot \tilde{V}_l(i)$.    (28)

For test and reference coefficients computed with full precision from
the same speech samples, $d_l = 0$.

The comparison was performed on 161 frames of speech taken from 
spoken digits. The dynamic range of the speech was 38 dB. 

In Fig. 8a, a histogram of distances computed according to eq. (28) 
is displayed. The negative distances are a normal result of taking the 
log of a quantity that is slightly less than 1 due to round-off error. 
The average of the distances is 0.021. 

This distance is negligibly small compared with the distances
associated with the variation in word pronunciation shown by scores for
correct word recognition. In Fig. 8b, the error histogram of Fig. 8a is
overlaid on the histogram for correct word recognition using the same
distance measurement.’? The average of the correct recognition scores
is about 0.45.

Fig. 8—Comparison of LPC distance from FXDSP to distance of correct word
matches.

The distance of the FXDSP calculation from the true floating point 
computation is significantly less than the distance arising from vari- 
ation when a given analog waveform is digitized at randomly varying 
phase. This distance, measured by repetitively playing taped speech 
with a professional quality recorder into a digital speech recognizer, 
averages about 0.035.” 

The major source of error between floating point and FXDSP
computation arises from the linear-to-μ-law-to-linear conversion that
is performed on the FXDSP path through Fig. 7, but not on the floating
point path. The left column of Table III shows a sequence of
particularly large distances that contributed to Fig. 8. In the right
column are the much smaller distances that result from performing
linear-to-μ-law conversion, followed by μ-law-to-linear conversion, on
the speech at a point immediately preceding the floating point LPC
analysis (FXFLOAT). The average distance here drops from 0.09 to
0.012. A preliminary investigation on more speech frames suggests
that about 75 percent of the distance between floating point simulation
and FXDSP implementation is because of the linear-to-μ-law-to-linear
conversion.


VII. SUMMARY 


A single-chip basic building block for LPC-based connected- and 
isolated-word recognition systems has been described. The single chip 
is an appropriately programmed digital signal processor of AT&T Bell 
Laboratories. 

Because the major limitation in attaining single-chip implementation
was the amount of program memory available, several novel
programming techniques were used to conserve program memory.
These included (1) development of a program architecture that
interleaved a background main frame-inversion program with a
foreground sample update program, (2) development of a form of
Durbin’s recursion suitable for implementation as an
iteration-independent subroutine, (3) use of overlaid subprograms with
multiple exit points, and (4) use of a Taylor series expansion, rather
than a look-up table, to store and permute segments of a Hamming
window.

Table III—Comparison of FXDSP-to-floating point distances, with and
without linear-μ-law-linear conversion in floating point computation

Frame    Without    With
1        0.100      0.013
2        0.147      0.008
3        0.055      0.035
4        0.226      0.008
5        0.052      0.020
6        0.010      0.001
7        0.009      0.002
AVG:     0.090      0.012

Comparison with numerical simulations shows that the error
introduced by the implementation is negligible. This good match renders
the chip suitable for use in systems that use quantities calculated in
floating point on general-purpose computers, such as statistically
clustered templates or frames for speaker-independent word
recognition or for recognition based on vector quantization or hidden
Markov modeling.
This paper analyzes a mathematical model of a blocking system with
simultaneous resource possession. There are several multiserver service facil- 
ities without extra waiting space at which several classes of customers arrive 
in independent Poisson processes. Each customer requests service from one 
server in each facility in a subset of the service facilities, with the subset 
depending on the customer class. If service can be provided immediately upon 
arrival at all required facilities, then service begins and all servers assigned to 
the customer start and finish together. Otherwise, the attempt is blocked (lost 
without generating retrials). The problem is to determine the blocking prob- 
ability for each customer class. An exact expression is available, but it is 
complicated. Hence, this paper investigates approximation schemes. 


I. INTRODUCTION AND SUMMARY 


The multifacility blocking problem considered here arises in many 
contexts and has a long history in traffic engineering (see pages 77 
and 95 of Ref. 1). We were motivated by performance analysis issues 
in packet-switched communication networks. Specifically, we were 
investigating methods for calculating the blocking probabilities (per- 
centage of failed attempts) in setting up virtual circuits. The need for 
methods to calculate these blocking probabilities arose in the devel- 
opment of the Packet Network Performance Analysis module of the 
Packet Network Design and Analysis (PANDA) software package in
the Operations Research Department of AT&T Bell Laboratories.”*
It is difficult to analyze the blocking because a circuit typically requires 
the simultaneous possession of limited resources associated with sev- 
eral different facilities (transmission links, memory buffers, etc.). 
Moreover, there is competition for the resources not only from other 
demands for circuits on the same path, but also from demands for 
different circuits that use only some of the same facilities. Hence, even 
without alternate routing or waiting (which we do not consider), the 
blocking is complicated. Our purpose here is to develop bounds and 
approximations. After we describe our model, we will discuss related 
work and other applications. 


1.1 The mathematical model 


There are n multiserver service facilities without extra waiting room
and c customer classes. Service facility i has $s_i$ servers. Customers from
class j arrive according to a Poisson process with rate $\lambda_j$ and
immediately request service from one server at each facility in a subset
$A_j$ of the n service facilities. If all servers are busy in any of the
required facilities, the request is blocked (lost without generating
retrials). Otherwise, service begins immediately in all the required
facilities. All servers working on a given customer from class j start
and free up together. The service time for class j at all facilities has
a general distribution with finite mean $\mu_j^{-1}$. We assume that the
c arrival processes and all the service times are mutually independent.

This model already embodies the extension in which each class
requires service from a random subset of the n facilities. Suppose that
class j with arrival rate $\lambda_j$ initially requires one server at each
facility in subset $A_{jk}$ with probability $p_{jk}$, where
$\sum_k p_{jk} = 1$. We can represent this more general model within our
framework by increasing the number of classes. New class (j, k) has a
Poisson arrival process with rate $\lambda_j p_{jk}$ and requires one
server in each facility in the subset $A_{jk}$. This procedure is
justified because of two familiar properties of Poisson processes:
(1) independent random splitting of a Poisson process produces
independent Poisson processes, and (2) the superposition of
independent Poisson processes is a Poisson process (see Theorems 4.2
and 5.3 of Ref. 4).
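The class-expansion step can be sketched as a small data transformation. The input layout and the name `split_classes` are hypothetical; the sketch only carries out the bookkeeping described above, in which class j with a random subset becomes several fixed-subset classes (j, k).

```python
def split_classes(classes):
    """Expand random-subset classes into fixed-subset classes.

    classes maps a class label j to (rate, [(probability, subset), ...]).
    Returns a dict mapping (j, k) to (rate * probability, frozenset(subset)).
    """
    expanded = {}
    for j, (rate, choices) in classes.items():
        total = sum(prob for prob, _ in choices)
        if abs(total - 1.0) > 1e-9:
            raise ValueError("subset probabilities must sum to 1")
        for k, (prob, subset) in enumerate(choices):
            # Random splitting of a Poisson stream gives a Poisson stream
            # with the thinned rate.
            expanded[(j, k)] = (rate * prob, frozenset(subset))
    return expanded
```

For instance, a class with rate 2.0 that needs facilities {0, 1} with probability 0.25 and facility {1} alone with probability 0.75 becomes two classes with rates 0.5 and 1.5.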

Returning to the previous setting in which each class requires a
fixed subset of facilities, we let b(A) be the probability that all servers
are busy in at least one facility in the subset A (at an arbitrary time
in steady state). Thus b(i) = b({i}) is the probability all servers are
busy at facility i. Since Poisson arrivals see time averages,” $b(A_j)$ is
also the blocking probability for class j. [The blocking probability for
class j would be $\sum_k p_{jk} b(A_{jk})$ if class j required a random
subset of facilities, as described above.]

It is not difficult to give the exact formula for b(A) using theory
related to queueing networks,** but the formula is complicated,
especially when the numbers n and c are large (see Theorem 4 in Section
II). To appreciate the complexity, recall that the arrival rates
$\lambda_j$, service rates $\mu_j$, and subsets of required facilities
$A_j$ for class j can differ from class to class. Hence, our interest
centers on developing bounds and approximations.
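To see why the exact calculation becomes unwieldy, a brute-force sketch for very small models is given below. It assumes the standard product form for loss systems of this kind, in which the steady-state weight of a feasible state is the product over classes of $\rho_j^{n_j}/n_j!$ with $\rho_j = \lambda_j/\mu_j$; this is taken here to correspond to Theorem 4, which is not reproduced in this section. The function name and input layout are hypothetical.

```python
from itertools import product
from math import factorial


def blocking_probabilities(servers, classes):
    """Exact blocking by state enumeration (small models only).

    servers: list of server counts s_i, indexed by facility.
    classes: list of (offered_load, set_of_facility_indices).
    Returns one blocking probability per class.
    """
    # Class j can never have more customers than its smallest facility holds.
    limits = [min(servers[i] for i in subset) for _, subset in classes]
    total = 0.0
    blocked = [0.0] * len(classes)
    for state in product(*(range(limit + 1) for limit in limits)):
        busy = [0] * len(servers)
        for count, (_, subset) in zip(state, classes):
            for i in subset:
                busy[i] += count
        if any(b > s for b, s in zip(busy, servers)):
            continue                          # infeasible state
        weight = 1.0
        for count, (load, _) in zip(state, classes):
            weight *= load ** count / factorial(count)
        total += weight
        for j, (_, subset) in enumerate(classes):
            # Class j is blocked if any facility it needs is full.
            if any(busy[i] == servers[i] for i in subset):
                blocked[j] += weight
    return [value / total for value in blocked]
```

With one facility and one class this reduces to the Erlang loss formula. The number of states grows as the product of the per-class limits, which is the practical reason for turning to bounds and approximations.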


1.2 Related work


There is a substantial body of related literature. The problem treated 
here is connected, at least in spirit, to the theory of gradings and link 
systems in traffic engineering.’ The specific approximation problem is 
discussed by Holtzman.? Also somewhat related is the work on sto- 
chastic models of dynamic storage allocation.!°? Previous work also 
has been done on service systems, with waiting as well as blocking, in 
which customers require more than one server.” 

Our model is relatively elementary compared with many of these 
other models. Our analysis benefits by having blocking instead of 
waiting and by having each customer require exactly one server per 
facility. On the other hand, we address an issue typically not consid- 
ered in the papers in which customers require more than one server: 
Here there are constraints on which servers can be used; there must 
be one server from each facility. To make the comparison clear, it is 
useful to modify our model by considering one large facility containing 
all the servers in all the original facilities. If one of our customers 
requiring service from one server in each of k facilities could use any 
k servers in the single large facility, then we would have the model of 
Arthurs and Stuck.’® Here, however, there are constraints. 

The model considered here is in fact a special case of a more general 
single-facility blocking model of Kaufman,’ in which there is a general 
sharing rule. From Kaufman or Burman, Lehoczky and Lim,® we learn 
that our model is a product-form model with the insensitivity prop- 
erty.© This provides expressions for the exact blocking probabilities 
(Theorem 4 in Section 2.1 here), but as noted above this exact 
expression is quite intractable. The insensitivity property tells us that 
the blocking probabilities depend on the service-time distributions 
only through their means, so that there is no need to assume exponen- 
tial service-time distributions; for convenience we can replace general 
service-time distributions by exponential service-time distributions 
without affecting the blocking probabilities. We discuss insensitivity 
further in Sections 1.8 and IV. 

A special case of our model has also been analyzed in a database
locking study by Mitra and Weinberger.”!”? In their model, the
facilities are items in the database and the customers are transactions
that “touch” a specific set of items. To maintain consistency, only one
transaction is allowed to touch each item at any time, so that in their
model transactions requiring items already being touched are blocked.
Their model thus corresponds to the special case of our multifacility
blocking model in which each facility has a single server. (It may be
of interest to consider the extension of their model to multiserver
facilities to represent multiple copies of database items in the
database.)

Mitra and Weinberger show that the analysis can be greatly simpli- 
fied by focusing attention on special symmetric versions of the model. 
They assume that the arrival rate and service rate for each customer 
class that requires k facilities (items) is independent of the subset of 
k facilities required. Moreover, they assume that there is a customer 
class for each subset of size k. Most important, they consider only the 
case of one server per facility. (It should be clear that the case of 
multiserver facilities is much harder.) For these special symmetric 
models, they obtain an efficient algorithm for calculating the partition 
function of the product-form model, from which the desired blocking 
probabilities are easily obtained. [For some database locking applica- 
tions, it may not be reasonable to assume that the arrival rates for all 
subsets of size k are identical. Then the approximation methods in 
this paper may be helpful. See Remark 3 in Section 1.6.] 

The symmetric case of the multifacility blocking model has also 
been considered by Heyman in the investigation of a communication 
system.”? We shall also discuss symmetric models here, beginning in 
Section 1.6. For symmetric models, the approximations are more 
reliable and much easier to compute. 

We have mentioned that this work was primarily motivated by 
performance analysis issues in packet-switched communication net- 
works, specifically in the PANDA software package.” The approxi- 
mations here have also been applied to study the blocking in an AT&T 
Bell Laboratories computer network” and an AT&T Communications 
model for overseas voice traffic.”” Another example of the multifacility 
blocking problem in telephony is contained in Akinpelu.”° The work 
that bears most directly on this paper is in Refs. 2, 3, 7, 8, 9, 21, and 
23 through 26. (Also see Section VIII.) 


1.3 Summary and organization of this paper 


We describe our main results in the rest of Section I and provide 
the supporting technical details in the remaining sections. We discuss 
three different approximation schemes: the summation bound, the 
product bound, and the reduced-load approximation. The two bounds 
are well-known approximations. The reduced-load approximation
evidently has a long history,® but is not as well known as it deserves
to be. We propose for the reduced-load approximation a successive
approximation scheme that is very easy to implement and seems to
perform well. In particular, the reduced-load approximation with the 
successive approximation scheme is ideally suited for large models, 
where the exact formula becomes intractable. 

We obtain two major results about these approximation schemes: 
(1) As suggested by the names, the first two approximation schemes 
indeed yield upper bounds on the blocking probabilities, and (2) a 
limit theorem establishes that the third approximation scheme, the 
reduced-load approximation, is asymptotically correct for symmetric 
models as the size of the model grows, in a sense which we will make 
precise. It is significant that the limiting conditions do not correspond 
to light traffic as in Refs. 21 and 23, so that in this limit the reduced- 
load approximation can be very different from the bounds. Our two 
main results have mathematical interest as well as applied interest, 
because they are obtained by focusing on multidimensional stochastic 
processes that are not Markov. 

We also obtain two additional light-traffic results. First, we show 
that all three approximations are asymptotically equivalent as the 
loads decrease (Corollary 2.3). Second, we show that all three approx- 
imations are asymptotically correct as the loads decrease for symmetric 
models (Corollary 3.2). The qualification “for symmetric models” in 
the last sentence is important because it can happen for asymmetric 
models that all three approximations are equally bad in light traffic 
(see the remark at the end of Section 1.5). As we mentioned in Section 
1.2, the approximations are more reliable and easier to use with 
symmetric models, but we believe they are also very useful for asym- 
metric models when applied with some caution. 

We discuss the bounds in Section 1.4, the reduced-load approxima- 
tion in Section 1.5, and the main limit theorem in Section 1.6. We 
discuss numerical examples in Section 1.7; existence, uniqueness, and 
insensitivity of equilibrium blocking probabilities in Section 1.8; and 
an extension of the reduced-load approximation for non-Poisson ar- 
rival processes in Section 1.9. Additional technical details will be 
provided in subsequent sections. The main results and directions for 
future research are summarized in Section VI. 

Here are the principal conclusions from our analysis and limited 
numerical experience: For light loads, for example, blocking on the
order of 0.01 or less, the elementary bounds are usually adequate
approximations for engineering purposes. Having established that 
these approximations are bounds, we gain some peace of mind in 
knowing that the approximations are conservative. For higher levels 
of blocking, such as 0.05 and above, the reduced-load approximation 
typically does much better than the elementary bounds. Moreover, the
successive approximation scheme proposed for the reduced-load
approximation is very easy to use (Theorem 2), so that the reduced-load
approximation seems very attractive when the loads need not be light.


1.4 Bounds

Let $B(s, a)$ be the classical Erlang blocking formula associated with
the M/G/s/loss service system with $s$ servers and offered load $a$,?"78
defined by

$$B(s, a) = \frac{a^s/s!}{\sum_{k=0}^{s} a^k/k!}, \tag{1}$$

where, as usual, the offered load $a$ is the arrival rate multiplied by the
expected service time. Let $C_i$ be the set of all classes that request
service from facility $i$, that is,

$$C_i = \{j : i \in A_j\}. \tag{2}$$

Let $\hat{a}_i$ be the offered load at facility $i$ (not counting blocking
elsewhere), defined by

$$\hat{a}_i = \sum_{j \in C_i} a_j, \tag{3}$$

where $a_j = \lambda_j/\mu_j$ is the offered load of class $j$ to the system as a whole.
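
As a numerical aside, eqs. (1) through (3) are easy to evaluate by machine. The following Python sketch is an illustration only and is not taken from the paper: the names `erlang_b` and `facility_loads`, the 0-based facility indices, and the use of the standard recursion $B(k, a) = aB(k-1, a)/[k + aB(k-1, a)]$ in place of the direct sum in (1) are assumptions made here.

```python
def erlang_b(s, a):
    """Erlang blocking probability B(s, a) of eq. (1): s servers, offered load a."""
    if s < 0 or a < 0:
        raise ValueError("s and a must be nonnegative")
    b = 1.0  # B(0, a) = 1: with no servers every arrival is blocked
    for k in range(1, s + 1):
        # Standard recursion: B(k, a) = a B(k-1, a) / (k + a B(k-1, a))
        b = a * b / (k + a * b)
    return b


def facility_loads(subsets, class_loads, n):
    """Offered load at each facility, eq. (3).

    subsets[j] is the set A_j of facilities (0-based, an assumption of this
    sketch) required by class j; class_loads[j] is the class load a_j.
    """
    hat_a = [0.0] * n
    for A_j, a_j in zip(subsets, class_loads):
        for i in A_j:
            hat_a[i] += a_j  # class j belongs to C_i exactly when i is in A_j
    return hat_a
```

For instance, the data of Example 2 below (written with 0-based indices) give `facility_loads([{0}, {0, 1}], [1.0, 100.0], 2)`, that is, a load of 101 at the first facility.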

In Section II we establish the following bounds. These bounds are 
standard approximations that have long been regarded as conserva- 
tive.’ We show that intuition is correct in this case. 
Theorem 1: (Product Bound) For each subset $A$,

$$b(A) \le 1 - \prod_{i \in A} [1 - B(s_i, \hat{a}_i)].$$

Corollary 1.1: (Facility Bound) For each $i$, $b(i) \le B(s_i, \hat{a}_i)$.
Proof: Let $A = \{i\}$. □
Corollary 1.2: (Summation Bound) For each subset $A$,

$$b(A) \le \sum_{i \in A} B(s_i, \hat{a}_i).$$

Proof: The summation bound in Corollary 1.2 is always greater than 
or equal to the product bound in Theorem 1, as is easily verified by 
induction on the number of facilities in A. Corollary 1.2 also follows 
directly from Corollary 1.1 and the Bonferroni inequalities (see page 
110 of Ref. 29). □
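
The two bounds can be compared numerically with a few lines of code. This is a minimal sketch, assuming the facility loads $\hat{a}_i$ of (3) have already been computed; the function names and the 0-based indexing are illustrative choices, not part of the paper.

```python
from math import prod


def erlang_b(s, a):
    """Erlang blocking probability B(s, a) of eq. (1), by the standard recursion."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def product_bound(A, servers, hat_a):
    """Right side of Theorem 1 for the facility subset A."""
    return 1.0 - prod(1.0 - erlang_b(servers[i], hat_a[i]) for i in A)


def summation_bound(A, servers, hat_a):
    """Right side of Corollary 1.2 for the facility subset A."""
    return sum(erlang_b(servers[i], hat_a[i]) for i in A)
```

With two single-server facilities, each offered load 1/9 (so that the facility bound is 0.1), the product bound is 0.19 and the summation bound is 0.2, which agrees with the ordering stated in the proof of Corollary 1.2.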
Remarks: For the special case of two facilities, Corollary 1.2 has been 
proved by different methods by D. D. Sheng and D. R. Smith; see the 
appendix in Ref. 3. The simple approximation provided by the sum- 
mation bound in Corollary 1.2 was used in early versions of the 
PANDA software package,” before being replaced by the product bound
in Theorem 1.* The summation bound in Corollary 1.2 coincides with
the asymptotic approximation developed by Mitra and Weinberger,” 
which is a light-traffic limit. (See Corollaries 2.3 and 3.2 below.) 

We give a separate proof of the facility bound in Corollary 1.1, 
which is of independent interest. We apply Theorem 5 of Smith and 
Whitt® to obtain a monotone-likelihood-ratio ordering for the number 
of busy servers (Theorem 5), which does not follow from our proof of 
Theorem 1. 

Our proof of Theorem 1 is based on a general technique for com- 
paring a non-Markov process to a Markov process using the condi- 
tional transition rates, which applies to many different definitions of 
stochastic order (Theorem 6). We apply this technique to prove 
Theorem 1 using the version of stochastic order for probability
distributions on $R^n$ based on comparing cumulative distribution functions
(Theorem 7). For $n > 1$, this stochastic ordering is weaker than the
standard form of stochastic order based on all increasing sets. Our 
general approach for comparing a non-Markov process to a Markov 
process has much wider applicability, and is discussed further else- 
where.*! Our approach exploits stochastic monotonicity of the Markov 
process,°*”* and is closely related to the stochastic comparisons for 
multidimensional Markov processes by Massey that have been applied 
to establish comparisons for Markovian queueing network models.**-** 

The bound in Theorem 1 corresponds to independent blocking in
the different facilities with the bound in Corollary 1.1 used in each
facility. It is natural to conjecture that Theorem 1 might be obtained
from the more easily established Corollary 1.1 and the inequality

$$b(A) \le 1 - \prod_{i \in A} [1 - b(i)], \tag{4}$$

but (4) is not valid in general (see Example 6 in Section II).

For typical applications in which the bounds are relatively small 
and customers require only a few facilities, the bounds usually are 
excellent approximations (see Corollary 3.2), but the following exam- 
ples demonstrate that the bounds are not always good approximations. 
Example 1: Suppose that all $n$ facilities have $s$ servers. Let there be
only one customer class, which requests service from all $n$ facilities.
Then $\hat{a}_i = a_1$ and $b(\{1, \ldots, n\}) = b(1) = B(s, \hat{a}_1) = B(s, a_1)$, so that
the bound in Corollary 1.1 is tight (an equality), but the bounds in
Theorem 1 and Corollary 1.2 can be poor approximations. □
Example 2: Suppose that there are two facilities and two customer
classes. Let $s_1 = 10$, $s_2 = 1$, $A_1 = \{1\}$, $A_2 = \{1, 2\}$, $a_1 = 1$, and $a_2 = 100$.
Then $B(s_1, \hat{a}_1) = B(10, 101) \approx 1$, but $b(1) \approx 0$, because at most one
class 2 customer can be in service at any time. Hence, in this case the
bound in Corollary 1.1 is a very bad approximation. □


1.5 A reduced-load approximation


Since the approximations in Theorem 1 and its corollaries are upper
bounds, it is natural to look for reduced values that might be better
approximations. One way to do this is to reduce the offered load $\hat{a}_i$ at
facility $i$ by taking into account the blocking elsewhere. It is natural
to develop such an approximation within a framework of facility
independence, that is, the assumption that the events of blocking at
the different facilities are independent. This seems to be a reasonable
approximating assumption for “typical” examples, which has been
applied before for multiple facilities.’** As a consequence, we have the
facility-independence approximation

$$1 - b(A) \approx \prod_{i \in A} [1 - b(i)]. \tag{5}$$

Next we introduce the following approximate total offered load at
facility $i$, using the facility independence assumed above:

$$\tilde{a}_i = \sum_{j \in C_i} a_j \prod_{k \in A_j,\, k \ne i} [1 - b(k)]. \tag{6}$$

In (6) we have reduced the offered load $a_j$ of class $j$ at facility $i$ by the
blocking elsewhere. Of course, using (6) we make the offered loads
dependent on the blocking probabilities as well as vice versa. [However,
the facility-independence approximation in (5) greatly reduces the
complexity.] Hence, this leads to a system of equations characterizing
the blocking probabilities as our approximation. In particular, our
proposed reduced-load approximation for the blocking probability at
facility $i$ is the solution to the following system of equations:

$$b^*(i) = B\Bigl(s_i, \sum_{j \in C_i} a_j \prod_{k \in A_j,\, k \ne i} [1 - b^*(k)]\Bigr), \quad 1 \le i \le n. \tag{7}$$

From (1), we see that (7) yields $n$ polynomial equations in the $n$
unknowns $b^*(1), \ldots, b^*(n)$.

In general, solving a system of n nonlinear equations in n unknowns 
can be quite unpleasant. Of course, in many situations there are 
symmetries in the model, which allow us to reduce the number of 
equations (and variables). In fact, in the next section we discuss the 
totally symmetric model, for which (7) reduces to one equation in one 
variable, which is trivial to solve. (The database model in Ref. 21 also 
simplifies in this way.) However, we also propose a relatively simple 
computational scheme for solving the general system (7). In particular, 
we suggest using successive approximations, that is, iteratively apply- 
ing the right side of (7) to successive candidate vectors of blocking 
probabilities. The following theorem indicates that (7) always has a
solution and that the successive approximation scheme either finds
the unique solution or provides upper and lower bounds on all solutions 
to (7). 

Theorem 2: (Existence and Successive Approximation) If $s_i$ and $\hat{a}_i$ are
strictly positive for each $i$, then the system of eqs. (7) has a solution
$b^* = [b^*(1), \ldots, b^*(n)]$ with $0 < b^*(i) < 1$ for all $i$. All solutions $b^*$ can
be bounded above and below, and sometimes found, by successive
approximation, that is, by iteratively applying the operator
$T = T\{[b(1), \ldots, b(n)]\}$ mapping $[0, 1]^n$ into itself defined by the right
side of (7), starting with $\mathbf{1} = (1, 1, \ldots, 1)$. In particular, successive
applications of $T$ yield the following upper and lower bounds on
$[b^*(1), \ldots, b^*(n)]$ for all $k$:

$$(0, 0, \ldots, 0) = \mathbf{0} = T(\mathbf{1}) \le T^{2k+1}(\mathbf{1}) \le T^{2k+3}(\mathbf{1}) \le [b^*(1), \ldots, b^*(n)] \le T^{2k+2}(\mathbf{1}) \le T^{2k}(\mathbf{1}) \le T^{0}(\mathbf{1}) = \mathbf{1} = (1, 1, \ldots, 1). \tag{8}$$


Proof: First, the operator $T$ defined by the right side of (7) obviously
maps $[0, 1]^n$ into itself. Since $T$ is continuous, $T$ has a fixed point, by
the classical Brouwer fixed point theorem.” Let $b^*$ represent such a
fixed point. Since the operator $T$ is strictly decreasing,
$b(i) > b^*(i) > T(b)_i$ for all $i$, where $b = (b(1), \ldots, b(n))$, whenever
$b(i) > T(b)_i$ for all $i$, and $b(i) < b^*(i) < T(b)_i$ for all $i$ whenever
$b(i) < T(b)_i$ for all $i$. Since $T(\mathbf{1}) = \mathbf{0}$ and $T(\mathbf{0}) < \mathbf{1}$,
$0 < b^*(i) < 1$ for all $i$. □

Since $T$ is continuous and strictly decreasing, the iterated operator
$T^2$ is continuous and strictly increasing. Hence, the successive
approximation scheme (8) converges in the sense that $T^{2k+1}(\mathbf{1}) \to L$ and
$T^{2k}(\mathbf{1}) \to U$, where $L = (L_1, \ldots, L_n)$ and $U = (U_1, \ldots, U_n)$ are lower
and upper bounds, respectively, on any solution to (7), that is,
$L_i \le b^*(i) \le U_i$, $1 \le i \le n$. Often we will have $L = U$, that is,
$L_i = U_i = b^*(i)$ for all $i$, but not always, because $T^2$ can have more than one
fixed point, as Example 3 below illustrates. Of course, from the
monotonicity just discussed, it is clear that the successive
approximation scheme in (8) converges if and only if the two-step operator
$T^2$ has only one fixed point.

We have yet to thoroughly investigate when $T$ has a unique fixed
point and when the successive approximation scheme (8) converges.
Sufficient conditions for $T$ to be a contraction map on a complete
metric space, so that $T$ has a unique fixed point and the successive
approximation algorithm in (8) converges to it, are given in Section
V, but these conditions are very strong. We make the following
conjecture. (It has been proved; see Section VIII.)


Conjecture 1: The reduced-load system of eqs. (7) always has a unique
solution.

It is easy to see that (7) has a unique solution in the case of two
facilities each with a single server. Extensive numerical testing
supports Conjecture 1 when there are only two facilities.

It is, of course, natural to wonder whether the model itself might 
have multiple equilibrium points, but the exact stochastic process 
under consideration representing the number of customers from each 
class in service (with exponential service-time distributions) is an 
irreducible finite-state, continuous-time Markov chain, which neces- 
sarily has a unique equilibrium distribution (Section 2.1 below). Thus, 
if there are multiple solutions to (7), then they must be an artifact of 
the approximation. 

We now present an example to show that the successive approxi- 
mation in (8) can fail to converge. 


Example 3: (Nonconvergence) To see that the successive approximation
scheme in (8) need not converge to a solution of (7), consider the
symmetric model with three facilities, each with one server. Let there
be only one customer class, which requires service from all facilities.
Let the offered load be $a$. Then (7) consists of the three equations

$$b^*(1) = B\{1, a[1 - b^*(2)][1 - b^*(3)]\}$$

$$b^*(2) = B\{1, a[1 - b^*(1)][1 - b^*(3)]\}$$

$$b^*(3) = B\{1, a[1 - b^*(1)][1 - b^*(2)]\}$$

in the three unknowns $b^*(1)$, $b^*(2)$, and $b^*(3)$. However, when
we apply the operator $T$, we see that $T$ maps the space of vectors
$(b_1, b_2, b_3)$ with $b_1 = b_2 = b_3$ into itself. Since we start with $(1, 1, 1)$ in
(8), we only need consider the associated operator $T$ on $[0, 1]$, defined
by

$$T(b) = B[1, a(1 - b)^2] = \frac{a(1 - b)^2}{1 + a(1 - b)^2}.$$

The equation $T^2(b) = b$ leads to the polynomial equation

$$x^5 - x^4 + 2a^{-1}x^3 - 2a^{-1}x^2 + (a + 1)a^{-2}x - a^{-2} = 0$$

for $x = 1 - b$, which factors as

$$(x^2 - x + a^{-1})(x^3 + a^{-1}x - a^{-1}) = 0.$$

The second (cubic) factor also arises from the equation $T(b) = b$. This
cubic polynomial is easily seen to be monotone, so that it has a unique
root, which falls in the interval $(0, 1)$. This is the unique symmetric
fixed point to the symmetric model. The quadratic factor has two roots

$$x = (1 \pm \sqrt{1 - 4a^{-1}})/2,$$

which are real and distinct when $a > 4$. These two roots $x_1$ and $x_2$
satisfy $0 < x_i < 1$ and $x_1 + x_2 = 1$. The quadratic factor does not have
real roots when $a < 4$.

In the case $a = 10$, $T$ has unique fixed point $b = 0.607$, which
corresponds to the symmetric solution $(0.607, 0.607, 0.607)$. However,
$T^2$ has three fixed points: 0.113, 0.607, and 0.887. Hence, the
successive approximation scheme (8) fails to converge to the unique
symmetric fixed point of $T$; instead it eventually oscillates between
$L = (0.113, 0.113, 0.113)$ and $U = (0.887, 0.887, 0.887)$. The exact blocking
probability in this case is 0.909, obtained directly from the Erlang loss
formula (1). The reduced-load approximation for the customer blocking
probability is $b^*(\{1, 2, 3\}) = 1 - (1 - 0.607)^3 = 0.939$. □
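
For readers who wish to check the algebra in Example 3, the polynomial equations can be derived as follows. This is a worked sketch added for illustration, using only the scalar operator $T$ and the substitution $x = 1 - b$ from the example.

```latex
\begin{align*}
1 - T(b) &= \frac{1}{1 + a x^2}, \qquad x = 1 - b, \\
T(b) = b &\iff x\,(1 + a x^2) = 1
          \iff x^3 + a^{-1} x - a^{-1} = 0, \\
T^2(b) = b &\iff x = \frac{(1 + a x^2)^2}{(1 + a x^2)^2 + a}
            \iff (1 - x)(1 + a x^2)^2 = a x \\
           &\iff x^5 - x^4 + 2a^{-1}x^3 - 2a^{-1}x^2 + (a + 1)a^{-2}x - a^{-2} = 0 .
\end{align*}
```

With $a = 10$ the quadratic factor has roots $x = (1 \pm \sqrt{0.6})/2 \approx 0.113$ and $0.887$, so that $b = 1 - x$ takes the two values between which the iteration oscillates.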
Remark: It is significant that with exactly two facilities, the successive
approximation scheme in (8) converges if and only if $T$ has a unique
fixed point, that is, if and only if (7) has a unique solution. We have
already noted that convergence of (8) is equivalent to $T^2$ having only
one fixed point. Obviously, $T^2$ inherits all fixed points of $T$, so that
if $T$ has multiple fixed points, then (8) will not converge. On the other
hand, if (8) fails to converge, then the bounds $(L_1, L_2)$ and $(U_1, U_2)$
obtained from (8) are two distinct fixed points of $T^2$. In turn,
$(L_1, U_2)$ and $(U_1, L_2)$ are two distinct fixed points of $T$.

This argument extends to certain multifacility models, which include
many applications of interest.”° Suppose that the set of facilities
can be partitioned into two subsets such that each customer requires
service from one facility in each subset. Let there be $n_1$ facilities in
the first subset, numbered $1 \le i \le n_1$, and $n_2$ facilities in the other
subset, numbered $n_1 + 1 \le i \le n_1 + n_2$. If the successive approximation
(8) fails to converge, then $(L_1, \ldots, L_{n_1}, L_{n_1+1}, \ldots, L_{n_1+n_2})$ and
$(U_1, \ldots, U_{n_1}, U_{n_1+1}, \ldots, U_{n_1+n_2})$ are distinct fixed points of $T^2$. It is
easy to see that $(L_1, \ldots, L_{n_1}, U_{n_1+1}, \ldots, U_{n_1+n_2})$ and
$(U_1, \ldots, U_{n_1}, L_{n_1+1}, \ldots, L_{n_1+n_2})$ are then distinct fixed points of $T$. □

To summarize the proposed reduced-load approximation, we find 
approximate blocking probabilities at each facility $i$ by solving (7). To
solve (7), we suggest using the successive approximation (8). Succes- 
sive iterations yield upper and lower bounds on all solutions to (7). If 
the upper and lower bounds are sufficiently close, then we can stop 
and use the approximation with some confidence. If the successive 
approximation bounds are not close, then the whole approach is 
suspect and we suggest using any solution to (7) with caution. An 
advantage of solving (7) by (8) is that if (8) converges, then we know 
there is a unique solution to (7). Moreover, if (8) fails to converge, 
then we get a warning about the whole approach. Also, (8) is extremely 
easy to implement. Of course, if (8) fails to converge, then we can look 
for solutions to (7) by other methods. Alternatively, we might choose 
to use the final upper bound obtained from (8). 

After obtaining the approximate blocking probabilities at each
facility [which usually is a solution to (7), but might not be], we obtain
the approximate total offered loads at each facility via (6) and the
approximate blocking probabilities for each class via (5).
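
The procedure just summarized can be sketched in Python as follows. This is an illustration under stated assumptions rather than the implementation used for the paper: the names `apply_T` and `successive_approximation`, the 0-based facility indices, the tolerance, and the stall test that stops the loop once the iterates stop moving are all choices made here.

```python
from math import prod


def erlang_b(s, a):
    """Erlang blocking probability B(s, a) of eq. (1), by the standard recursion."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def apply_T(b, servers, subsets, class_loads):
    """One application of the operator T, i.e., the right side of eq. (7)."""
    n = len(servers)
    reduced = [0.0] * n
    for A_j, a_j in zip(subsets, class_loads):
        for i in A_j:
            # Thin the load of class j at facility i by the blocking
            # at the other facilities that class j requires.
            reduced[i] += a_j * prod(1.0 - b[k] for k in A_j if k != i)
    return [erlang_b(servers[i], reduced[i]) for i in range(n)]


def successive_approximation(servers, subsets, class_loads,
                             tol=1e-9, max_pairs=10_000):
    """Iterate T from (1, ..., 1) as in (8); return (lower, upper, converged)."""
    n = len(servers)
    upper = [1.0] * n                                      # T^0(1)
    lower = apply_T(upper, servers, subsets, class_loads)  # T^1(1)
    for _ in range(max_pairs):
        new_upper = apply_T(lower, servers, subsets, class_loads)      # even iterate
        new_lower = apply_T(new_upper, servers, subsets, class_loads)  # odd iterate
        moved = max(abs(x - y) for x, y in
                    zip(new_upper + new_lower, upper + lower))
        upper, lower = new_upper, new_lower
        if max(u - l for u, l in zip(upper, lower)) < tol:
            return lower, upper, True    # bounds have met: unique solution found
        if moved < tol:
            break                        # iterates settled but the bounds differ
    return lower, upper, False
```

Applied to the model of Example 3 (three single-server facilities, one class with offered load 10), the routine reports nonconvergence and returns bounds near 0.113 and 0.887; in that case the final upper bound, or a solution of (7) found by another method, would have to be used with caution.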


Remark 1: To implement the successive approximation in (8) or 
otherwise solve (7), we need to be able to conveniently calculate the 
Erlang blocking probability eq. (1) and, for some methods such as 
Newton’s method, its derivatives. For this purpose, we can apply 
techniques of Jagerman.?”%5 □


Remark 2: The successive approximation in (8) and the associated bounds
closely parallel a proposed successive approximation algorithm to
approximately solve closed networks of queues with a decoupling
infinite-server node in Section VI of Ref. 40. The analog in Ref. 40 of
the operator $T$ above necessarily has a unique fixed point and the
successive approximation scheme also yields bounds on it. However,
the successive approximation scheme in Ref. 40 also can fail to
converge to the fixed point. Further discussion of the successive
approximation in Ref. 40 will appear in a subsequent paper. □

Theorem 2 provides a way to relate the reduced-load approximation
(7) to the bound in Corollary 1.1. In particular, we can bound the
reduced-load approximation (7) much as we already bounded the exact
blocking probability at facility $i$.
Corollary 2.1: The reduced-load blocking approximation at facility $i$, that
is, any $b^*(i)$ obtained from (7), satisfies

$$B\Bigl(s_i, \sum_{j \in C_i} a_j \prod_{k \in A_j,\, k \ne i} [1 - B(s_k, \hat{a}_k)]\Bigr) \le b^*(i) \le B(s_i, \hat{a}_i).$$

Proof: The upper bound is $T^2(\mathbf{1})$ and the lower bound is $T^3(\mathbf{1})$ in the
successive approximation (8). □

Let b*(A) be the reduced-load approximate blocking probability for 
the subset A obtained by combining (5) and (7). From (5) and Corollary 
2.1, we immediately obtain the following bounds for b*(A). 
Corollary 2.2: For each subset $A$, the reduced-load blocking approximation
$b^*(A)$ satisfies

$$1 - \prod_{i \in A} \Bigl(1 - B\Bigl(s_i, \sum_{j \in C_i} a_j \prod_{k \in A_j,\, k \ne i} [1 - B(s_k, \hat{a}_k)]\Bigr)\Bigr) \le b^*(A) \le 1 - \prod_{i \in A} [1 - B(s_i, \hat{a}_i)].$$

Note that we have not yet given any lower bounds for the exact
blocking probabilities. Obviously, $b(A) \ge \max\{b(i) : i \in A\}$, but it seems
hard to obtain an improvement. One might conjecture that the exact
blocking probability $b(i)$ at facility $i$ is bounded below by the lower
bound in Corollary 2.1, but the following example shows that this is
not the case.
Example 4: To see that the lower bound in Corollary 2.1 is not a lower
bound on the exact blocking probability, suppose that there are three
facilities and two customer classes. Let $s_1 = s_2 = 1$, $s_3 = 3$, $A_1 = \{1, 3\}$,
$A_2 = \{2, 3\}$, and $a_1 = a_2 = a$. Then $b(3) = 0$ because at most two of the
three servers can be busy at the third facility because of the constraints
elsewhere. However, it is easy to see that the lower bound in Corollary
2.1 is strictly positive. □

As a further consequence of Theorem 2, we can show that the
bounds in Theorem 1 and its corollaries and the reduced-load
approximation in (7) are all asymptotically equivalent as the offered loads
per facility become negligible, that is, as $\hat{a}_i \to 0$ for all $i$.


Corollary 2.3: If $\hat{a}_i \to 0$ for each $i$, then

(i) $B(s_i, \hat{a}_i)/(\hat{a}_i^{s_i}/s_i!) \to 1$,

(ii) $b^*(i)/B(s_i, \hat{a}_i) \to 1$,

(iii) $\Bigl\{1 - \prod_{i \in A} [1 - B(s_i, \hat{a}_i)]\Bigr\} \Big/ \sum_{i \in A} B(s_i, \hat{a}_i) \to 1$,

(iv) $b^*(A) \Big/ \sum_{i \in A} b^*(i) \to 1$,

(v) $b^*(A) \Big/ \Bigl\{1 - \prod_{i \in A} [1 - B(s_i, \hat{a}_i)]\Bigr\} \to 1$

for all subsets $A$, where $b^*(i)$ and $b^*(A)$ are the reduced-load
approximations based on (5) and (7).

Proof: Part (i) follows immediately from the form of the Erlang
blocking formula in (1). Part (ii) follows from Corollary 2.1 after
dividing each term by $B(s_i, \hat{a}_i)$ and letting $\hat{a}_i \to 0$ for all $i$. To establish
the limit for the lower bound, let $\bar{a} = \max\{\hat{a}_i, 1 \le i \le n\}$ and
$\underline{s} = \min\{s_i, 1 \le i \le n\}$. Then

$$\prod_{k \in A_j,\, k \ne i} [1 - B(s_k, \hat{a}_k)] \ge [1 - B(\underline{s}, \bar{a})]^{n-1}$$

for all $i$ and $j$, and $[1 - B(\underline{s}, \bar{a})]^{n-1} \to 1$ as $\hat{a}_i \to 0$ for all $i$. Hence,

$$B\Bigl(s_i, \sum_{j \in C_i} a_j \prod_{k \in A_j,\, k \ne i} [1 - B(s_k, \hat{a}_k)]\Bigr) \ge B\bigl(s_i, \hat{a}_i [1 - B(\underline{s}, \bar{a})]^{n-1}\bigr)$$

and

$$B\bigl(s_i, \hat{a}_i [1 - B(\underline{s}, \bar{a})]^{n-1}\bigr) \big/ B(s_i, \hat{a}_i) \to 1$$

as $\hat{a}_i \to 0$ because $B(s_i, \hat{a}_i x)/B(s_i, \hat{a}_i) \to x^{s_i}$ as $\hat{a}_i \to 0$ uniformly in $x$
in any compact subinterval of $(0, \infty)$. Given (i) and (ii), (iii) through
(v) are elementary. □

Corollary 2.3 demonstrates that using the more elementary approx- 
imations in Theorem 1 and its corollaries instead of the reduced-load 
approximation (7) is justified if the loads are sufficiently light. Theo- 
rem 1 and Corollary 2.3 also suggest that the reduced-load approxi- 
mation b*(A) itself might be an upper bound, but the following 
example shows that the reduced-load approximation b*(i) obtained 
from (7) is not an upper bound in general. 


Example 5: To see that the reduced-load approximation is not an
upper bound, let there be two facilities, each with one server. Let there
be two classes with $A_1 = \{1, 2\}$ and $A_2 = \{1\}$, so that $\hat{a}_1 = a_1 + a_2$ and
$\hat{a}_2 = a_1$. The reduced-load approximation is determined by the two
equations

$$b^*(1) = B\{1, a_1[1 - b^*(2)] + a_2\}$$

$$b^*(2) = B\{1, a_1[1 - b^*(1)]\},$$

from which we easily deduce that $0 < b^*(i) < 1$ for each $i$, so that
$b^*(A_2) = b^*(1) < B(1, a_1 + a_2) = b(1)$. □
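
A quick numerical check of Example 5 is possible because $B(1, a) = a/(1 + a)$. The sketch below is illustrative only: it solves the two equations by plain fixed-point iteration started from zero, which is an assumption made here for simplicity (it is not the scheme (8)) and is adequate only for moderate loads such as the values tried.

```python
def erlang_b_one_server(load):
    """B(1, load) from eq. (1) with s = 1."""
    return load / (1.0 + load)


def reduced_load_example5(a1, a2, iterations=500):
    """Solve the two reduced-load equations of Example 5 by fixed-point iteration."""
    b1, b2 = 0.0, 0.0
    for _ in range(iterations):
        # Both right-hand sides are evaluated from the previous pair (b1, b2).
        b1, b2 = (erlang_b_one_server(a1 * (1.0 - b2) + a2),
                  erlang_b_one_server(a1 * (1.0 - b1)))
    return b1, b2
```

With $a_1 = a_2 = 1$ the iteration settles at $b^*(1) \approx 0.634$, which is below the value $B(1, 2) = 2/3$ appearing on the right in the example.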

Remark: One might conjecture that all the approximations for the
exact blocking probability $b(i)$ are asymptotically correct as the loads
decrease, but Example 2 shows that this is not nearly the case. If
$a_2 = 100a_1$ there, then $b(1)/a_1^{10} \to 101$, while $B(s_1, \hat{a}_1)/a_1^{10} \to (101)^{10}$ as
$a_1 \to 0$. However, a positive result for large symmetric models appears
in Corollary 3.2 below. □


1.6 Symmetric solutions to symmetric models 


In this section we consider the special case of symmetric models in 
which all facilities have s servers and offered load a, and all classes 
require service from m facilities. To have full model symmetry, we 
also assume that there is a class requiring service from each subset of 
m facilities, and that the offered loads are the same for each class. We 
also assume that the arrival rates and service rates are the same for 
all classes. 

If we restrict attention to symmetric solutions to symmetric models, 
then the reduced-load system of eqs. (7) simplifies to the single 
polynomial equation in one variable 


$$b^* = B(s, a(1 - b^*)^{m-1}), \tag{9}$$

where $b^*(i) = b^*$ for all $i$. Since the right side of (9) is continuous and
decreasing as a function of $b^*$, (9) has a unique solution, which is easy
to find.

Note that we have restricted attention to symmetric solutions of (7)
in order to obtain the single eq. (9). We have not yet ruled out 
asymmetric solutions to symmetric models. However, we conjecture 
that none exist. (See Section VIII.) 
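
Because the right side of (9) is decreasing in $b^*$, the symmetric solution can be located by bisection. The following Python sketch is one way to do so; the function names, the tolerance, and the choice of bisection are assumptions made here, since the paper does not prescribe a method for the scalar equation.

```python
def erlang_b(s, a):
    """Erlang blocking probability B(s, a) of eq. (1), by the standard recursion."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def solve_symmetric(s, a, m, tol=1e-12):
    """Unique root in [0, 1] of eq. (9): b = B(s, a (1 - b)^(m - 1))."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # B(s, a (1 - b)^(m-1)) - b is decreasing in b, positive at 0, negative at 1.
        if erlang_b(s, a * (1.0 - mid) ** (m - 1)) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def class_blocking(b_star, m):
    """Blocking for a class requiring m facilities, from (5) with b(i) = b_star."""
    return 1.0 - (1.0 - b_star) ** m
```

For $s = 1$, $a = 0.11111$, and $m = 2$ this gives a facility blocking of about 0.0917 and a class blocking of about 0.175.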


Conjecture 2 (Corollary to Conjecture 1): The symmetric solution [that
is, the solution to (9)] to the reduced-load approximation eqs. (7) is the
only solution for a symmetric model.

To investigate the accuracy of the approximation (5) through (9),
we investigate the asymptotic behavior of symmetric models as
$n \to \infty$ with the offered load per facility, $a$, and the number of facilities
required per class, $m$, held fixed. In Section III we prove that the
approximation (5) through (9) is asymptotically correct as $n \to \infty$
under these conditions. Note that since we fix the offered load per
facility, $a$, this limit does not correspond to light traffic.

To state the main result, let $Y_{ni}$ be the number of busy servers at
facility $i$ and let $Z_{nk}$ be the proportion of the facilities with $k$ busy
servers (both in steady state) when there are $n$ facilities. Let $\xrightarrow{p}$ denote
convergence in probability.

Theorem 3: If $n \to \infty$ with $a$ and $m$ held fixed for the symmetric model,
then

(a) $Z_{nk} \xrightarrow{p} \beta_k$ as $n \to \infty$ for each $k$, where $\beta_k$ satisfies the M/G/s/loss
formula

$$\beta_k = \frac{\xi^k/k!}{\sum_{l=0}^{s} \xi^l/l!} \tag{10}$$

with

$$\xi = a(1 - \beta_s)^{m-1}; \tag{11}$$

that is, $\beta_s$ is the unique symmetric solution to (9).

(b) For any finite subset $H$, the random variables $Y_{ni}$, $i \in H$, are
asymptotically mutually independent as $n \to \infty$.

We establish Theorem 3 in Section III by first focusing on the
stochastic process representing the proportion of facilities with $k$ busy
servers at time $t$, $1 \le k \le s$ and $t \ge 0$. The key result is a functional
law of large numbers for this sequence of stochastic processes as
$n \to \infty$ (Theorem 8). The analysis is challenging because this stochastic
process is not Markov.

From Theorem 3, we easily obtain our desired corollary. 

Corollary 3.1: For symmetric models, the symmetric reduced-load
approximation in (5) through (9) is asymptotically correct as $n \to \infty$ with
$a$ and $m$ held fixed.

We can combine Corollaries 2.3 and 3.1 to conclude that the bounds 
in Theorem 1 and its corollaries are also asymptotically correct with 
light loads.

Corollary 3.2: For symmetric models, the bounds in Theorem 1 and its
corollaries are asymptotically correct as $n \to \infty$ and $a \to 0$; that is, for
each integer $k$ and each positive $\epsilon$, there is a critical offered load $a_0$ and
an integer-valued function $n(a)$ such that

$$\left| \frac{b(A)}{k B(s, a)} - 1 \right| < \epsilon$$

for all $a < a_0$ and $n \ge n(a)$, where $A$ is a subset with $k$ facilities.

Example 5 shows that the reduced-load approximation is not an 
upper bound on the actual blocking probability in general. Example 1 
shows that the symmetric reduced-load approximation b*(i) for the 
blocking probability at each facility in a symmetric model need not be 
an upper bound either. However, we make the following conjecture. 


Conjecture 3: The reduced-load approximation for the blocking proba- 
bility of each customer in a symmetric model in which the number of 
facilities per customer is fixed, obtained by combining (5) and (9), is 
always an upper bound. 

Remark 1: In our symmetric model each customer requires service 
from m facilities. Instead, as in Ref. 21, we could have different types 
of customers, with customers of type m requiring service from m 
facilities. The facilities remain symmetric with this change, so that if 
we still restrict attention to symmetric solutions to the symmetric 
model, then we again obtain a single polynomial equation in one 
variable. In particular, suppose that we have $M$ types, numbered from
1 to $M$. If we let $\hat{a}_m$ be the total offered load of type $m$ at each facility,
then we obtain (9) with the second argument of $B$ replaced by
$\sum_{m=1}^{M} \hat{a}_m (1 - b^*)^{m-1}$. Consequently, it is easy to approximately solve the
models in Ref. 21 and generalizations in which each facility has $s$
servers. With this extended symmetric model we abandon Conjecture
3. It is easy to get a counterexample by modifying Example 1 to
introduce additional customers that require service from only one
facility and have negligible offered load. □


Remark 2: The reduced-load approximation has the potential of being 
a powerful and flexible approximation tool if we judiciously control 
the amount of symmetry. For example, we can obtain a richer class of 
database locking models by requiring only partial symmetry. Some 
regions of the database may be requested much more than others. 
There may also be a tendency for the items requested in a given 
transaction to cluster together. These general features can be repre- 
sented by partitioning the database into mutually exclusive subsets 
and assuming symmetry only within each subset. In addition, we can 
introduce various types of transactions, as in Remark 1 above. The 
partial symmetry causes the reduced-load approximation to be a
system of k equations in k unknowns, where k is the number of subsets
in the partition. The number of transaction types does not increase 
the number of equations. Again, the successive approximation (8) can 
be applied. □

Remark 3: Mitra and Weinberger established Corollary 3.2 for
multiple-customer types in the special case of one server per facility via
their asymptotic analysis.2> Heyman also has a different proof of
Corollary 3.2 in the special case of one server per facility, assuming
that the total offered load in the network is fixed as $n \to \infty$.7


1.7 A few numerical examples 


Table I compares the approximations in Theorem 1 and its corol- 
laries with the reduced-load approximation in (5) through (9) for 
several symmetric models. The various approximations were calcu- 
lated “by hand” at the terminal using the Erlang blocking formula 
algorithms of Jagerman”® (coded by Moshe Segal). The approximations 
are all independent of the number of facilities, so n is not specified. 
Based on Theorem 3, the reduced-load approximation in (9) is asymptotically
correct for large $n$. The offered load per facility $a$ in (9) is
chosen so that the nominal blocking per facility (the bound in Corol- 
lary 1.1) has a specified value: 0.10 in the first six cases, 0.02 in the 
next three cases, and 0.01 in the last three cases. 

From Table I (and intuition), it is apparent that the quality of the
bounds as approximations is a decreasing function of the number $s$ of
servers per facility, the offered load per facility $a$, and the number $m$
of facilities per class. In the case of nominal blocking per facility of
0.01 and only two facilities required per class (the last three cases),
the simple summation bound in Corollary 1.2 seems to be adequate.
However, the case of $s = 50$ and $m = 5$ produces perhaps a surprisingly
large discrepancy between (9) and the bounds.


Table I: The approximate blocking probability for each customer
class in symmetric models: a comparison of the approximation
procedures

Servers per  Offered Load  Facilities  Summation Bound   Product Bound  Reduced-Load
Facility     per Facility  per Class   in Corollary 1.2  in Theorem 1   Approximation (9)
s            a             m
 1           0.11111       2           0.200             0.190          0.175
10           7.51          2           0.200             0.190          0.146
50           49.6          2           0.200             0.190          0.126
 1           0.11111       3           0.300             0.271          0.234
10           7.51          3           0.300             0.271          0.178
50           49.6          3           0.300             0.271          0.157
 1           0.0204        5           0.100             0.096          0.089
10           5.087         5           0.100             0.096          0.072
50           40.27         5           0.100             0.096          0.057
 1           0.010101      2           0.0200            0.0199         0.0197
10           4.464         2           0.0200            0.0199         0.0192
50           37.90         2           0.0200            0.0199         0.0180


Table II: Examples of the successive approximations
in (8) for the reduced-load approximation with the
symmetric model in the first three cases of Table I

Servers per facility s                  1        10       50
Offered load per facility a             0.11111  7.51     49.6
Facilities per class m                  2        2        2
Bound in Corollary 1.2                  0.200    0.200    0.200
Bound in Theorem 1                      0.190    0.190    0.190
Bound in Corollary 1.1                  0.100    0.100    0.100
Iteration one                           0.089    0.069    0.051
Iteration two                           0.0919   0.078    0.074
Iteration three                         0.0916   0.076    0.063
Iteration four                          0.0917   0.077    0.068
Reduced-load approx. (9)                0.0917   0.076    0.065
Approximate blocking for each
  class by (5) and (9)                  0.175    0.146    0.126

Table II displays the outcomes of the successive approximations in 
(8) applied to the first three cases in Table I. The successive iterations 
describe the blocking per facility, as in (7) and (9). Then (5) is applied 
to obtain the blocking per class. From Table II it is apparent that
about five iterations yield adequate accuracy, that is, they get close
enough to the fixed point (9). In these examples the successive
approximation scheme in (8) converges to the unique symmetric fixed
point of (9).
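
The successive approximation behind Table II can be sketched numerically.
The Python sketch below is an illustration under stated assumptions, not the
paper's own procedure: it assumes that in the symmetric model the fixed-point
equation (9) has the form b = B(s, α̂(1 - b)^(m-1)) and that (5) gives the
class blocking as 1 - (1 - b)^m, a form that reproduces the s = 1 entries of
Table I. The function names and the starting value are hypothetical choices,
so individual iterates need not match the rows of Table II.

```python
def erlang_b(s, a):
    """Erlang blocking B(s, a) for integer s via the standard recursion."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def symmetric_reduced_load(s, a, m, tol=1e-12, max_iter=10_000):
    """Iterate b <- B(s, a * (1 - b)**(m - 1)), the assumed symmetric form of (8)."""
    b = erlang_b(s, a)  # assumed start: the nominal bound of Corollary 1.1
    for _ in range(max_iter):
        b_next = erlang_b(s, a * (1.0 - b) ** (m - 1))
        if abs(b_next - b) < tol:
            return b_next
        b = b_next
    raise RuntimeError("successive approximation did not converge")


def class_blocking(b, m):
    """Facility-independence approximation (5) in the symmetric model."""
    return 1.0 - (1.0 - b) ** m


if __name__ == "__main__":
    b = symmetric_reduced_load(1, 0.11111, 2)
    print(b, class_blocking(b, 2))
```

For large s the plain iteration can oscillate for many steps before settling,
as the s = 50 column of Table II suggests, so the iteration limit is kept
generous.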

Table III compares the approximations with exact blocking probabilities
for different numbers of facilities in the special case of a
symmetric model with s = 1 (one server per facility) and m = 2 (two
facilities required per class). When s = 1, the exact blocking probability
is relatively easy to compute because, with exponential service times
having mean one (which we can assume without loss of generality by
Theorem 4 and Corollary 4.2), the number of customers in service
(which is the number of busy servers divided by m) is a birth-and-death
process with death rate $\mu(k) = k$ and birth rate

$$\lambda(k) = \left(\frac{n\hat{\alpha}}{m}\right) \frac{(n - km)(n - km - 1) \cdots (n - km - m + 1)}{n(n - 1) \cdots (n - m + 1)}. \tag{12}$$
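
The birth-and-death description can be turned into a short numerical check
of the exact column of Table III. The sketch below is an illustration under
stated assumptions rather than the computation actually used for the table:
it assumes that the customer classes are all m-element subsets of the n
facilities, so that the birth rate with k customers in service is
(nα̂/m) C(n - km, m) / C(n, m), and it uses the fact that Poisson arrivals
see time averages. The function name is hypothetical.

```python
from math import comb


def exact_blocking_single_server(n, load, m):
    """Exact class blocking when s = 1, assuming classes are all m-subsets.

    n: number of facilities, load: offered load per facility,
    m: facilities required per class.
    """
    total_classes = comb(n, m)

    def birth(k):
        # A class is admitted only if all m of its facilities are among
        # the n - k*m idle ones.
        return (n * load / m) * comb(n - k * m, m) / total_classes

    # Unnormalized stationary weights with death rate mu(k) = k.
    weights = [1.0]
    k = 0
    while n - k * m >= m:
        weights.append(weights[-1] * birth(k) / (k + 1))
        k += 1

    norm = sum(weights)
    # Probability that an arriving customer of a given class is admitted.
    admitted = sum(
        w * comb(n - j * m, m) / total_classes for j, w in enumerate(weights)
    )
    return 1.0 - admitted / norm
```

With n = 4, an offered load of 1 per facility, and m = 2, the three
stationary weights are 1, 2, and 1/3, and the function returns 0.600, the
corresponding entry of Table III.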


The data in Table III for this special case were obtained from D. P. 
Heyman (personal communication). This case is consistent with Theorem 3,
which establishes that (9) is asymptotically correct as n → ∞.
Table III leads us to conjecture that the exact blocking probability for 
each class is increasing in n in this case. More generally, we make the 
following conjecture. 
Table III. Comparison of approximations with exact blocking
probabilities for symmetric models when s = 1 (one server per
facility) and m = 2 (two facilities required per class)

Number of   Offered Load  Exact        Reduced-Load       Product Bound  Summation Bound
Facilities  per Facility  Blocking     Approximation (9)  in Theorem 1   in Corollary 1.2
n           α̂             Probability
  2         0.010101      0.0100       0.0197             0.0199         0.0200
  4         0.010101      0.0165       0.0197             0.0199         0.0200
  8         0.010101      0.0183       0.0197             0.0199         0.0200
 40         0.010101      0.0195       0.0197             0.0199         0.0200
100         0.010101      0.0196       0.0197             0.0199         0.0200
  2         0.111111      0.100        0.175              0.190          0.200
  4         0.111111      0.154        0.175              0.190          0.200
  8         0.111111      0.168        0.175              0.190          0.200
 40         0.111111      0.1735       0.175              0.190          0.200
100         0.111111      0.1744       0.175              0.190          0.200
  2         1.0000        0.500        0.618              0.750          1.000
  4         1.0000        0.600        0.618              0.750          1.000
  8         1.0000        0.611        0.618              0.750          1.000
 40         1.0000        0.6168       0.618              0.750          1.000
100         1.0000        0.6176       0.618              0.750          1.000


Conjecture 4: The exact blocking probability for each customer class in
a symmetric model is a nondecreasing function of the number n of
facilities when the offered load per facility $\hat{\alpha}$ and the number m of
facilities per customer are held fixed.


Remark: Conjecture 3 is a corollary to Conjecture 4 and Theorem 3. □

For typical blocking probabilities (0.001 through 0.2), the quality of 
the approximations appears to be a decreasing function of the offered 
load per facility (or nominal blocking probability), but this is evidently 
not true over the full range. The middle four cases in Table III provide 
greater relative differences than the last four cases, comparing (9) 
with the exact values. 

As our final example in this subsection, we consider a communica- 
tion network with traffic from several different sources to a common 
destination, as depicted in Fig. 1. Traffic from each source needs two 
lines: one line in a facility associated with that source plus one line in 
a final facility shared by all sources. When there are n sources, there
are n customer classes and n + 1 facilities. For each i, 1 ≤ i ≤ n, class
i requires one server from facility i and one server from facility n + 1.
Note that this example has the special structure mentioned in the 
remark following Example 3, so that for the reduced-load approxima- 
tion the successive approximation scheme in (8) converges if and only 
if the operator T has a unique fixed point. 

Fig. 1. A communication network with four facilities and three customer classes.

Tables IV and V give numerical results obtained by J. T. Wittbold”
for several cases in which n equals 2 and 3, respectively. We display
the exact blocking probability and the reduced-load approximation for
each facility and for each customer class. The successive approximation
converged quickly in every case. For the customer classes, we also
display the product bounds and the approximation obtained by taking
the product of the exact facility nonblocking probabilities (the last
column). This last column helps assess how much of the error is due
to assuming facility independence.

For the cases with high blocking probabilities, for example, Case 1 
in Tables IV and V, the reduced-load approximation is much better 
than the product bound, as expected. Overall, the reduced-load
approximation appears adequate for engineering purposes. For lighter
loads, for example, Cases 4 through 6 in Tables IV and V, the product
bound seems adequate for most engineering purposes. It should be
effective for properly sizing facilities given forecasting data. 
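
For readers who want to experiment with this example, the reduced-load
computation for the network of Fig. 1 can be sketched as follows. The sketch
assumes that the reduced-load equations thin the load offered to a facility
by each class by the nonblocking probability of the other facility that the
class needs, and it iterates by plain successive substitution starting from
zero blocking. The function names, the starting point, and the stopping rule
are illustrative assumptions, not details taken from the paper.

```python
def erlang_b(s, a):
    """Erlang blocking B(s, a) for integer s via the standard recursion."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b


def star_reduced_load(servers, loads, tol=1e-12, max_iter=10_000):
    """Approximate facility blocking [b(1), ..., b(n), b(n+1)].

    servers: s_1, ..., s_n, s_{n+1}; loads: class loads alpha_1, ..., alpha_n.
    """
    n = len(loads)
    b = [0.0] * (n + 1)
    for _ in range(max_iter):
        new = [0.0] * (n + 1)
        for i in range(n):
            # Only class i uses facility i; thin its load by the shared facility.
            new[i] = erlang_b(servers[i], loads[i] * (1.0 - b[n]))
        # The shared facility sees every class, thinned by its source facility.
        shared_load = sum(loads[i] * (1.0 - b[i]) for i in range(n))
        new[n] = erlang_b(servers[n], shared_load)
        if max(abs(x - y) for x, y in zip(new, b)) < tol:
            return new
        b = new
    raise RuntimeError("successive substitution did not converge")


def class_blocking(b):
    """Class blocking from facility blocking, assuming facility independence."""
    n = len(b) - 1
    return [1.0 - (1.0 - b[i]) * (1.0 - b[n]) for i in range(n)]
```

Calling `star_reduced_load([40, 20, 50], [30.0, 15.0])` corresponds to the
data of Case 4 in Table IV and can be compared with the reduced-load columns
there.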


1.8 Existence, uniqueness, and insensitivity 


It is significant that we have assumed nothing about the service- 
time distributions except that they have finite means. For applications, 
experience indicates that call attempts can often be modeled reason- 
ably by a Poisson process, but that virtual circuit holding-time distri- 
butions are often not nearly exponential.*?*? In Section IV we rigor- 
ously establish that a steady-state blocking probability exists, is 
unique, and depends on the service-time distributions only through 
their means. For this, we apply the theory of Generalized Semi-Markov 
Processes (GSMPs) and the associated theory of insensitivity.**** 

It turns out that the model we consider also can be regarded as a 
special case of a model analyzed by Kaufman’ of blocking in a single 
facility in which customers request several servers and there is a 
general resource-sharing policy. The connection to Kaufman’s single- 
facility model is made by simply combining our n facilities and 
implementing our sharing scheme as one of his general sharing policies. 
The insensitivity property and the exact formula for the blocking are 
Table IV. Comparison of approximations with exact blocking probabilities for the
communication network example in Fig. 1 with n = 2 sources

                                       Blocking Probability
                                       at Facility i        Blocking Probability for Customer Class i
                   Number of  Offered                                                    Product of
Case     Facility  Servers    Load     Reduced              Product  Reduced             Exact
Number   Number    s_i        α_i      Load      Exact      Bound    Load      Exact     Probabilities
1        1         20         30       0.209     0.199      0.61     0.42      0.41      0.43
         2         15         15       0.059     0.025      0.48     0.31      0.30      0.31
         n+1       30         -        0.266     0.293      -        -         -         -
2        1         30         30       0.067     0.041      0.29     0.19      0.17      0.19
         2         15         15       0.116     0.101      0.33     0.23      0.23      0.24
         n+1       40         -        0.183     0.151      -        -         -         -
3        1         30         30       0.006     0.000      0.45     0.36      0.36      0.36
         2         15         15       0.029     0.017      0.48     0.38      0.37      0.38
         n+1       30         -        0.360     0.365      -        -         -         -
4        1         40         30       0.008     0.002      0.068    0.055     0.050     0.051
         2         20         15       0.034     0.030      0.097    0.080     0.075     0.077
         n+1       50         -        0.048     0.049      -        -         -         -
5        1         42         30       0.006     0.006      0.023    0.021     0.017     0.017
         2         23         15       0.011     0.012      0.029    0.026     0.023     0.024
         n+1       56         -        0.014     0.015      -        -         -         -
6        1         42         27       0.0017    0.0012     0.005    0.005     0.004     0.004
         2         23         13       0.0036    0.0033     0.007    0.007     0.006     0.006
         n+1       56         -        0.0027    0.0027     -        -         -         -
Table V. Comparison of approximations with exact blocking probabilities for the
communication network example in Fig. 1 with n = 3 sources

                                       Blocking Probability
                                       at Facility i        Blocking Probability for Customer Class i
                   Number of  Offered                                                    Product of
Case     Facility  Servers    Load     Reduced              Product  Reduced             Exact
Number   Number    s_i        α_i      Load      Exact      Bound    Load      Exact     Probabilities
1        1         40         30       0.001     0.000      0.30     0.19      0.19      0.19
         2         10         20       0.446     0.452      0.67     0.55      0.55      0.56
         3         20         17       0.027     0.018      0.35     0.21      0.20      0.21
         n+1       50         -        0.190     0.192      -        -         -         -
2        1         40         30       0.008     0.003      0.16     0.06      0.05      0.05
         2         10         20       0.523     0.523      0.61     0.54      0.54      0.54
         3         20         17       0.067     0.063      0.22     0.11      0.10      0.11
         n+1       61         -        0.044     0.044      -        -         -         -
3        1         40         30       0.012     0.009      0.05     0.03      0.02      0.02
         2         10         8        0.115     0.116      0.15     0.13      0.13      0.13
         3         20         17       0.078     0.079      0.12     0.10      0.09      0.09
         n+1       63         -        0.020     0.016      -        -         -         -
4        1         40         30       0.013     0.013      0.03     0.02      0.02      0.02
         2         10         8        0.119     0.121      0.13     0.13      0.12      0.12
         3         20         17       0.083     0.085      0.10     0.09      0.09      0.09
         n+1       67         -        0.007     0.003      -        -         -         -
5        1         40         30       0.013     0.011      0.028    0.024     0.019     0.020
         2         10         5        0.017     0.017      0.031    0.028     0.025     0.026
         3         20         13       0.017     0.016      0.031    0.028     0.024     0.025
         n+1       60         -        0.011     0.009      -        -         -         -
6        1         40         28       0.0059    0.0045     0.016    0.014     0.011     0.012
         2         10         4        0.0050    0.0048     0.015    0.014     0.012     0.012
         3         20         12       0.0091    0.0084     0.019    0.018     0.016     0.016
         n+1       57         -        0.0085    0.0075     -        -         -         -


thus available from Ref. 7. (Reference 7 also mentions other related 
work.) We contribute to Ref. 7 by verifying the conjecture on p. 1477 
there that the insensitivity property holds for arbitrary service-time 
distributions, not just service-time distributions with rational Laplace- 
Stieltjes transforms. (The insensitivity analysis for our model extends 
to the setting of Ref. 7, but the bounds and approximations do not.) 

Insensitivity properties in queueing have a long history, going all 
the way back to Erlang.*” Insensitivity theory for queueing networks 
is largely due to Baskett, Chandy, Muntz and Palacios*® and Kelly.*® 
It is now understood” that this theory can be viewed as a consequence 
of the earlier work by Matthes” on “bedienungsprozesse” or GSMPs. 

As Kaufman observes,’ his model is equivalent to a closed multiclass 
BCMP network*® with the addition of extra population constraints. 
Without the population constraints, we could simply apply the
insensitivity theory developed by Baskett et al.*® and Kelly,*® which was
extended to arbitrary service-time distributions by Barbour*” (for
example, we could apply Section 3.3 of Ref. 6). As observed by Lam,”
it is possible to extend the insensitivity theory to closed networks with
population constraints, but it is perhaps more appropriate to recognize 
that the closed network, with or without population constraints, is a 
GSMP, and the insensitivity theory for GSMPs can be applied directly. 
The direct approach via GSMPs is contained in Burman et al.® The 
analysis in both Kaufman’ and Burman et al.® requires the addition 
of Ref. 46 to treat arbitrary service-time distributions. The technical 
details here for establishing existence, uniqueness, and insensitivity 
appear in Section IV. 


1.9 Non-Poisson arrival processes 


We now indicate how the reduced-load approximation (5) through 
(7) can be combined with previous approximations for the blocking in 
a single facility with non-Poisson arrival processes to generate ap- 
proximations for blocking probabilities in the multifacility model here 
when we relax the assumption that the arrival process of each class is 
a Poisson process. 

We assume that the arrival process of each class is a general
stationary point process” partially characterized by its arrival rate $\lambda_j$
and peakedness $z_j$. (The facilities are thus G/GI/s/loss systems instead
of M/GI/s/loss systems. See Refs. 55 through 57 and references in 
these sources for background on peakedness.) As before, we assume 
that the arrival processes of the different classes and all the service 
times are mutually independent. 

We regard the arrival process at facility i as the superposition of the
arrival processes of those classes requiring service from facility i.
Hence, paralleling (3), we define the peakedness of the arrival process
at facility i as

$$\hat{z}_i = \sum_{j \in C_i} \alpha_j z_j / \hat{\alpha}_i, \tag{13}$$

where $\alpha_j$ is the offered load and $z_j$ is the peakedness for class j. Formula
(13) is the standard peakedness approximation for a superposition
process, but it is based on the assumption that the service rates are
the same for all classes, which is not necessarily the case here. Since
we do not account for this difficulty, (13) should perform better if the
service rates $\mu_j$ do not vary much. References 55 through 57 describe
ways to determine the peakedness $z_j$ for each class; one relatively
simple way is the heavy-traffic approximation in (4) of Ref. 57, but
other more involved methods are usually more accurate.

For our new reduced-load approximation, we again use (5) and (6).
We propose Hayward’s approximation to extend (7).°°°” However,
with the non-Poisson arrival processes we must first carefully
distinguish different notions of blocking. Let $b_C(A)$, $b_{Ci}(A)$, and $b_T(A)$ be the
probability that all servers are busy in at least one facility in the set
A at the instant of an arbitrary arrival, at the instant of an arrival to
facility i, and at an arbitrary time, respectively (the overall call
congestion, the facility-i call congestion, and the time congestion). Let
$b_j$ and $b_j(i)$ be the blocking probability for class j overall and at
facility i, respectively. We are primarily interested in $b_j$, $b_{Ci}(i)$, and $b_T(A)$.

We apply Hayward’s approximation to approximate $b_{Ci}(i)$ as if
facility i were in isolation. We use the peakedness $\hat{z}_i$ in (13) to modify
(7) in the usual way:

$$b_{Ci}(i) = B(s_i/\hat{z}_i,\ a_i/\hat{z}_i), \tag{14}$$

where B(s, a) is the Erlang blocking formula in (1) extended to
noninteger s, as described in Refs. 27 and 28, and instead of (6) $a_i$ is

$$a_i = \sum_{j \in C_i} \alpha_j \prod_{\substack{k \in A_j \\ k \neq i}} [1 - b_T(k)]. \tag{15}$$

In (15) we use $b_T(k)$ to approximately represent the blocking probability
at facility k seen by an arbitrary arrival to facility i. This involves
an aspect of the basic facility independence approximation in (5).
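
As an illustration of (13) and (14), the following Python sketch evaluates
Hayward's approximation for a single facility. Because Refs. 27 and 28 are
not reproduced here, the extension of B(s, a) to noninteger s is computed
from the classical integral representation
1/B(s, a) = ∫ e^(-u) (1 + u/a)^s du over u ≥ 0, using Simpson's rule. That
numerical method, the function names, and the integration parameters are
choices made for this sketch and are not taken from the paper; the sketch
also assumes that the facility load in (13) is the sum of the class loads
at that facility.

```python
import math


def erlang_b_continuous(s, a, steps=20_000):
    """Erlang B for real s >= 0 and a > 0 via 1/B = int_0^inf e^-u (1+u/a)^s du."""
    # Truncate well beyond the integrand's peak near u = max(s - a, 0).
    upper = max(s - a, 0.0) + 60.0 + 10.0 * math.sqrt(s)
    h = upper / steps  # steps must be even for Simpson's rule

    def f(u):
        return math.exp(-u + s * math.log1p(u / a))

    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return 3.0 / (h * total)


def facility_peakedness(class_loads, class_peakedness):
    """Formula (13): load-weighted average of the class peakedness values."""
    total_load = sum(class_loads)
    weighted = sum(a * z for a, z in zip(class_loads, class_peakedness))
    return weighted / total_load


def hayward_call_congestion(s, load, z):
    """Formula (14): B(s / z, load / z) for one facility in isolation."""
    return erlang_b_continuous(s / z, load / z)
```

With peakedness 1 the function reduces to the ordinary Erlang formula, and
for a fixed number of servers and a fixed load a peakedness above 1 raises
the computed call congestion, which is the qualitative behavior expected of
bursty traffic.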

We obtain the approximate time congestion for facility i by using
the approximation

$$b_T(i) \approx b_{Ci}(i)/\hat{z}_i \tag{16}$$

(see page 695 of Ref. 57). [A significant improvement should be
possible by using (16) of Ref. 56 with the equivalent random method
instead of (16) above.] Hence, instead of (7), we obtain the following
system of n equations in the n unknowns $b_{Ci}(i)$ by combining (14)
through (16):

$$b_{Ci}(i) = B\Big(s_i/\hat{z}_i,\ (1/\hat{z}_i) \sum_{j \in C_i} \alpha_j \prod_{\substack{k \in A_j \\ k \neq i}} [1 - b_{Ck}(k)/\hat{z}_k]\Big). \tag{17}$$

Since the $\hat{z}_i$ are fixed positive scalars in (17), the successive approximation
in (8) applies here as well; that is, Theorem 2 and Corollaries 2.1 and
2.2 extend easily.

Given that we have obtained $b_{Ci}(i)$ and $b_T(i)$ via (16) and (17), we
combine (5) and (16) to obtain the time congestion for an arbitrary
subset A, that is,

$$b_T(A) = 1 - \prod_{i \in A} [1 - b_T(i)] = 1 - \prod_{i \in A} \{1 - [b_{Ci}(i)/\hat{z}_i]\}. \tag{18}$$

Next we obtain the blocking for class j at facility i by combining our
approximations with Fredericks’ approximation for parcel blocking,
(23) in Ref. 56. We obtain

$$b_j(i) = b_T(i) + \frac{(z_j - 1)}{(\hat{z}_i - 1)} [b_{Ci}(i) - b_T(i)]. \tag{19}$$

Finally, we obtain the overall blocking for class j by combining (5)
and (19), that is,

$$b_j = 1 - \prod_{i \in A_j} [1 - b_j(i)]. \tag{20}$$

The approximations for $b_{Ci}(i)$, $b_T(A)$, and $b_j$ in (17), (18), and (20)
have yet to be tested, but experience with the individual approximation
steps suggests that the combined procedure is promising.
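
Once the facility call congestions have been computed, the remaining steps
(16) and (18) through (20) are simple arithmetic. The following Python
sketch, with hypothetical function and argument names, shows how they fit
together for given call congestions, facility peakedness values, and a class
peakedness. Treating a facility peakedness of exactly 1 separately is a
choice made here to avoid dividing by zero in (19), on the grounds that (16)
then makes the call and time congestions coincide.

```python
from math import prod


def time_congestion(bc, z_hat):
    """(16): approximate time congestion at each facility."""
    return {i: bc[i] / z_hat[i] for i in bc}


def subset_time_congestion(bt, subset):
    """(18): time congestion for a set of facilities."""
    return 1.0 - prod(1.0 - bt[i] for i in subset)


def parcel_blocking(bc_i, bt_i, z_hat_i, z_j):
    """(19): blocking of class j at facility i."""
    if z_hat_i == 1.0:
        return bt_i  # call and time congestion agree under (16)
    return bt_i + (z_j - 1.0) / (z_hat_i - 1.0) * (bc_i - bt_i)


def class_blocking(bc, z_hat, facilities_of_class, z_j):
    """(20): overall blocking for a class requiring the given facilities."""
    bt = time_congestion(bc, z_hat)
    return 1.0 - prod(
        1.0 - parcel_blocking(bc[i], bt[i], z_hat[i], z_j)
        for i in facilities_of_class
    )
```

Two sanity checks follow directly from (19): a class with peakedness 1 sees
the time congestion at every facility, and a class whose peakedness equals
that of the facility sees the facility call congestion.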


II. THE BOUNDS

2.1 The exact blocking formula 


As a basis for proving Theorem 1, we first calculate the exact
blocking probabilities b(A). For this purpose, let $N_j$ represent the
steady-state number of class j customers in service. The distribution
of the vector $(N_1, \ldots, N_c)$ is conveniently described in terms of the
random vector $(N_1^\infty, \ldots, N_c^\infty)$, where $N_j^\infty$ represents the steady-state
number of class j customers in service when all n facilities have
infinitely many servers, but otherwise the model is the same. Of course,
in the infinite-server model the steady-state distribution is easy to
describe because there is no blocking, so that there is no interaction
among the classes; that is, the random variables $N_1^\infty, \ldots, N_c^\infty$ are
independent. From basic results for the M/G/∞ congestion model,®
the steady-state distribution is

$$P(N_j^\infty = k_j,\ 1 \le j \le c) = \prod_{j=1}^{c} P(N_j^\infty = k_j) = \prod_{j=1}^{c} \frac{\alpha_j^{k_j}}{k_j!} e^{-\alpha_j}. \tag{21}$$

As in closed Jackson networks of queues, the steady-state distribution
of $(N_1, \ldots, N_c)$ is obtained from (21) by simply conditioning.
(See Section 1.6 of Ref. 6.) We defer the proof until Section IV.

Theorem 4: The steady-state distribution of $(N_1, \ldots, N_c)$ exists, is
unique, depends on the service-time distributions only through their
means, and has the form

$$P(N_j = k_j,\ 1 \le j \le c) = P\Big(N_j^\infty = k_j,\ 1 \le j \le c \,\Big|\, \sum_{j \in C_i} N_j^\infty \le s_i,\ 1 \le i \le n\Big) = \frac{P(N_j^\infty = k_j,\ 1 \le j \le c)}{P\big(\sum_{j \in C_i} N_j^\infty \le s_i,\ 1 \le i \le n\big)}.$$

Of course, Theorem 4 can be used to give an exact expression for
the blocking probability b(A). Let $Y_i$ represent the number of busy
servers at facility i in our model, that is, $Y_i = \sum_{j \in C_i} N_j$.

Corollary 4.1: For each subset A, $b(A) = 1 - P(Y_i < s_i,\ i \in A)$.
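
To make Theorem 4 and Corollary 4.1 concrete, the following Python sketch
computes b(A) for a small model by brute-force enumeration of the truncated
product form. It is an illustration only: the data layout (one set of
facilities per class) and the function name are hypothetical, each class is
assumed to need exactly one server at each of its facilities, and the
enumeration is practical only for a handful of classes and servers.

```python
from itertools import product
from math import factorial


def exact_blocking(servers, class_facilities, loads, subset):
    """b(A) = 1 - P(Y_i < s_i for all i in A), by Theorem 4 and Corollary 4.1.

    servers: dict facility -> s_i
    class_facilities: list of sets A_j, one per class
    loads: list of offered loads alpha_j, one per class
    subset: the set A of facilities
    """
    # No class can have more customers than its scarcest facility allows.
    ranges = [range(min(servers[i] for i in a_j) + 1) for a_j in class_facilities]
    norm = 0.0
    free = 0.0
    for k in product(*ranges):
        busy = {i: 0 for i in servers}
        for k_j, a_j in zip(k, class_facilities):
            for i in a_j:
                busy[i] += k_j
        if any(busy[i] > servers[i] for i in servers):
            continue  # infeasible state, removed by the conditioning
        # Product-form weight; the factors exp(-alpha_j) cancel in the ratio.
        weight = 1.0
        for k_j, alpha_j in zip(k, loads):
            weight *= alpha_j ** k_j / factorial(k_j)
        norm += weight
        if all(busy[i] < servers[i] for i in subset):
            free += weight
    return 1.0 - free / norm
```

For three single-server facilities with classes requiring the pairs {1, 2},
{1, 3}, and {2, 3} and unit offered loads, the feasible states are the empty
state and the three states with one customer, each with weight 1, so the
function gives b({1, 2}) = 3/4 and b({1}) = 1/2.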

However, we apply Theorem 4 only via the following elementary 
consequence. 


Corollary 4.2: The distribution of $(N_1^\infty, \ldots, N_c^\infty)$ and thus also the
distributions of $(N_1, \ldots, N_c)$ and $(Y_1, \ldots, Y_n)$ depend on the vectors
of arrival rates $(\lambda_1, \ldots, \lambda_c)$ and service rates $(\mu_1, \ldots, \mu_c)$ only through
the vector of offered loads $(\alpha_1, \ldots, \alpha_c)$, where $\alpha_j = \lambda_j/\mu_j$.

Remark: It is significant in Corollary 4.2 that there is not just one
degree of freedom, corresponding to the choice of our measuring unit,
but c degrees of freedom. For example, we can arbitrarily select the
service rate $\mu_j$ for each class j, as long as the offered load $\alpha_j$ is as
originally specified. In fact, for us it will be convenient to make all
service rates identical. (See the proofs of Theorems 5 and 7.)


2.2 Proof of Corollary 1.1 


To give a direct proof of Corollary 1.1, we establish a stronger
stochastic comparison. Let N(s, a) represent the steady-state number
of busy servers in an M/G/s/loss system with s servers and offered
load a. We use the notion of Monotone-Likelihood-Ratio (MLR)
ordering.*° An integer-valued random variable $X_1$ is said to be less
than or equal to another integer-valued random variable $X_2$ in the
MLR ordering, denoted by $X_1 \le_r X_2$, if

$$\frac{P(X_1 = k + 1)}{P(X_1 = k)} \le \frac{P(X_2 = k + 1)}{P(X_2 = k)} \tag{22}$$

for all k. (We also require that the supports be ordered intervals; that
is, $P(X_i = k) > 0$ for integers $k \in [a_i, b_i]$, where $-\infty \le a_i < b_i \le +\infty$,
$a_1 \le a_2$, and $b_1 \le b_2$.) The MLR ordering is useful largely because it
implies ordinary stochastic order, namely,

$$E f(X_1) \le E f(X_2) \tag{23}$$

for all nondecreasing functions f for which the expectations are well
defined.*°*°

Theorem 5: For each facility i, $Y_i \le_r N(s_i, \hat{\alpha}_i)$.

Proof: First, make all service-time distributions exponential with
mean one, without altering any of the offered loads. By Theorem 4
and Corollary 4.2, this does not alter the steady-state distribution of
$(N_1, \ldots, N_c)$. Next apply Theorem 5 of Ref. 30. The service rate in
both systems is k when there are k busy servers. The arrival rate at
facility i in the actual system is always less than or equal to $\hat{\alpha}_i$. It is
less when there is blocking elsewhere. Of course, the support of both
random variables is the set $\{0, 1, \ldots, s_i\}$. □

Proof of Corollary 1.1: Apply Theorem 5 and (23), noting that

$$b(i) = P(Y_i = s_i) \le P[N(s_i, \hat{\alpha}_i) = s_i] = B(s_i, \hat{\alpha}_i). \tag{24}$$

□

Having proved Corollary 1.1, we immediately obtain Corollary 1.2
by virtue of the Bonferroni inequalities (see page 110 of Ref. 29).


2.3 Plausible stochastic comparisons 


It is natural to conjecture that Corollary 1.2 could be improved to
Theorem 1 by exploiting the exact relationship in Corollary 4.2 and
establishing the inequality (4) or, equivalently, that

$$P(Y_i < s_i,\ i \in A) \ge \prod_{i \in A} P(Y_i < s_i). \tag{25}$$

Formula (25) would follow from the random variables $Y_i$, $i \in A$, being
associated or just positively quadrant dependent (see pages 29 and 142
of Ref. 58). Unfortunately, however, (25) is not true in general, as we
show in Example 6 below.

One might also try to establish Theorem 1 via certain multivariate
stochastic comparisons. In particular, it is natural to consider the
multivariate versions of the MLR ordering $\le_r$ and the stochastic
ordering $\le_{st}$ defined in (22) and (23) (see Refs. 59 and 60). The
extension of $\le_{st}$ is defined again by (23). It is natural to conjecture
that

$$(Y_1, \ldots, Y_n) \le_{st} [N_1(s_1, \hat{\alpha}_1), \ldots, N_n(s_n, \hat{\alpha}_n)], \tag{26}$$

where the variables $N_i(s_i, \hat{\alpha}_i)$ are mutually independent. It is also
natural to conjecture the weaker relationships

$$P(Y_i = k_i,\ 1 \le i \le n) \le \prod_{i=1}^{n} P[N(s_i, \hat{\alpha}_i) = k_i] \tag{27}$$

and

$$P(Y_i \le k_i,\ 1 \le i \le n) \ge \prod_{i=1}^{n} P[N(s_i, \hat{\alpha}_i) \le k_i] \tag{28}$$

for all n-tuples $(k_1, \ldots, k_n)$. However, in Example 6 below we show
that (27) is not valid, which implies that (26) and the stronger ordering
with $\le_r$ instead of $\le_{st}$ in (26) are not valid either. However, it turns
out that (28) is valid, and that is the key to establishing Theorem 1.
Example 6: To see that (25) and (27) need not hold, consider the
symmetric model with n = c = 3, $s_1 = s_2 = s_3 = 1$, $A_1 = \{1, 2\}$,
$A_2 = \{1, 3\}$, $A_3 = \{2, 3\}$, $\mu_1 = \mu_2 = \mu_3 = 1$, and
$\lambda_1 = \lambda_2 = \lambda_3 = \alpha$. Then $b(A_j) = 3\alpha/(1 + 3\alpha)$ for all classes j
and $b(i) = 2\alpha/(1 + 3\alpha)$ for all facilities i. Hence, for $\alpha > 1$,

$$1 - b(A_j) = \frac{1}{1 + 3\alpha} < \left(\frac{1 + \alpha}{1 + 3\alpha}\right)^2 = [1 - b(1)][1 - b(2)], \tag{29}$$

so that (25) fails. On the other hand,

$$1 - b(A_j) = \frac{1}{1 + 3\alpha} > \left(\frac{1}{1 + 2\alpha}\right)^2 = [1 - B(s_1, \hat{\alpha}_1)]^2, \tag{30}$$

so that the conclusion of Theorem 1 still holds in this case.

To see that (27) can fail too, let $(k_1, k_2, k_3) = (1, 1, 0)$. Then

$$P(Y_i = k_i,\ 1 \le i \le 3) = \alpha/(1 + 3\alpha), \tag{31}$$

while

$$\prod_{i=1}^{3} P(N(s_i, \hat{\alpha}_i) = k_i) = [2\alpha/(1 + 2\alpha)]^2. \tag{32}$$

Hence, for $\alpha^2 < 1/8$, (27) fails. On the other hand, it is easy to see that
(28) does still hold in this example. By symmetry, it suffices to consider
only the two triples (1, 1, 0) and (1, 0, 0). □
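
The claims about (25) and (28) in Example 6 are easy to confirm numerically.
The following Python sketch, whose function names are illustrative,
enumerates the four feasible states of the example (no customer in service,
or exactly one customer of one class, with stationary probabilities
proportional to 1 and α) and evaluates both sides of (25) and (28).

```python
from itertools import product

CLASSES = [(0, 1), (0, 2), (1, 2)]  # facilities (0-based) used by each class


def busy_server_distribution(alpha):
    """Stationary law of (Y_1, Y_2, Y_3) as a dict state -> probability."""
    norm = 1.0 + 3.0 * alpha
    dist = {(0, 0, 0): 1.0 / norm}
    for facilities in CLASSES:
        state = tuple(1 if i in facilities else 0 for i in range(3))
        dist[state] = alpha / norm
    return dist


def holds_25(alpha):
    """Check (25) for A = {1, 2}: P(Y_1 < 1, Y_2 < 1) >= P(Y_1 < 1) P(Y_2 < 1)."""
    dist = busy_server_distribution(alpha)
    joint = sum(p for y, p in dist.items() if y[0] == 0 and y[1] == 0)
    marginal = sum(p for y, p in dist.items() if y[0] == 0)  # same for Y_2
    return joint >= marginal * marginal


def holds_28(alpha):
    """Check (28) for every triple k in {0, 1}^3."""
    dist = busy_server_distribution(alpha)
    idle = 1.0 / (1.0 + 2.0 * alpha)  # P[N(1, 2 * alpha) = 0]
    for k in product((0, 1), repeat=3):
        lhs = sum(p for y, p in dist.items() if all(a <= b for a, b in zip(y, k)))
        rhs = 1.0
        for k_i in k:
            rhs *= idle if k_i == 0 else 1.0
        if lhs < rhs - 1e-12:
            return False
    return True
```

Evaluating `holds_25` on either side of α = 1 shows the change of sign in
(29), while `holds_28` stays true for every positive α tried.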

In summary, Example 6 shows that none of the plausible relations 
(4), (25), (26), and (27) is valid, but the validity of Theorem 1 and 
(28), which would imply Theorem 1, remains open. We now proceed 
to establish (28). 


2.4 Proof of Theorem 1 
To prove Theorem 1 we establish (28). To establish (28), we develop 
a certain multivariate variation of Theorem 5 in Ref. 30. In particular,
we develop a general stochastic comparison result for continuous-time 
non-Markov jump processes in which the intensities of moving into 
certain sets are always greater for one process than the other. The 
results here are a special case of the general theory developed in Ref. 
31. They also can be obtained from the related work of Massey.°** 

For our comparison result, we consider an arbitrary finite state
space S. (It will be clear that similar results hold for infinite state
spaces, but it suffices for us to consider a finite state space.) Let the
space $\mathcal{P} \equiv \mathcal{P}(S)$ of all probability measures P on S be endowed with
an order relation ≤ defined by $P_1 \le P_2$ if $P_1(A) \le P_2(A)$ for all subsets
A of S in some class $\mathcal{A}$. (The order relation ≤ is obviously reflexive
and transitive, but it is not necessarily a partial order because it need
not be antisymmetric: $P_1 \le P_2$ and $P_2 \le P_1$ together do not necessarily
imply that $P_1 = P_2$; the relation will be a partial order if $\mathcal{A}$ is a
determining class.°') Since S is finite, the order relation is closed, that
is, it is preserved under limits: If $P_{1n} \le P_{2n}$ in $(\mathcal{P}, \le)$ for all n and
$P_{in}(\{s\}) \to P_i(\{s\})$ as $n \to \infty$ for each i and $s \in S$, then $P_1 \le P_2$. [In our
application S will be a finite subset of $R^n$, but ≤ will not correspond
to ordinary stochastic order on $\mathcal{P}(S)$ as defined in (23).]

The first process $Y_1(t)$ will be a Continuous-Time Markov Chain
(CTMC) with infinitesimal transition rates (generator) $q_1(s; A)$,
defined as usual for $s \in S$ and $A \subseteq S$ in terms of its transition function
by

$$P(Y_1(t + h) \in A \mid Y_1(t) = s) = h q_1(s; A) + o(h), \tag{33}$$

for $s \notin A$, where o(h) represents a quantity that converges to zero
after dividing by h.

The second process $Y_2(t)$ will also be a continuous-time jump
process with the jumps governed by infinitesimal transition rates, but
as in Ref. 30 these rates may depend on additional information other
than the current state, such as the history of the process. Let the
additional information at time t be $\Gamma(t)$, and let $\gamma$ represent a possible
value. [In our application the process $Y_2(t)$ represents the number of
busy servers at each facility, and the additional information $\Gamma(t)$ is
the number of customers of each class in service.] We assume that the
process $[Y_2(t), \Gamma(t)]$ is a CTMC on the product state space $S \times S'$,
where S' as well as S is finite. Let $q_2(s, \gamma; A)$ be the transition function
for $[Y_2(t), \Gamma(t)]$, defined by

$$P([Y_2(t), \Gamma(t)] \in A \mid Y_2(t) = s,\ \Gamma(t) = \gamma) = h q_2(s, \gamma; A) + o(h) \tag{34}$$

for $(s, \gamma) \notin A$ and $A \subseteq S \times S'$. We shall also use the transition function
for $Y_2(t)$, defined by $q_2(s, \gamma; A \times S')$ for $A \subseteq S$ and $s \in S$.
Just as in Ref. 30, the idea here is to compare the processes $Y_1(t)$
and $Y_2(t)$ by comparing the transition intensities in the space S,
requiring that the comparisons hold uniformly in the extra information
$\Gamma(t)$, which must be added to $Y_2(t)$ to make $Y_2(t)$ Markov. Of course,
a major complication here is the multidimensional state space S.
Following Kester** and Massey,*** we exploit nonstandard stochastic
orderings on S [consistent with (28)] and stochastic monotonicity of
the Markov process in this ordering in order to cope with the
dimension of the state space. In particular, in our theorem, we shall assume
that the transition function of the CTMC $Y_1(t)$ is stochastically
monotone.*2**
Definition 1: A CTMC $Y_1(t)$ has a stochastically monotone transition
function (kernel) $K_t \equiv K_t(s, A) = P(Y_1(t) \in A \mid Y_1(0) = s)$ if
$P_1 K_t \le P_2 K_t$ in $(\mathcal{P}, \le)$ whenever $P_1 \le P_2$ in $(\mathcal{P}, \le)$, where
$(P_i K_t)(A) = \sum_{s \in S} P_i(s) K_t(s, A)$.

Remark 1: It is significant in Definition 1 that both the condition and
the conclusion involve the same (unspecified) order relation ≤ on
$\mathcal{P}$. □

Remark 2: As in Section 2 of Keilson and Kester,®® stochastic
monotonicity of a CTMC $Y_1(t)$ can be characterized by the transition rate
function $q_1(s; A)$ and, after uniformization, by the transition function
$I + \epsilon q_1$ of an associated discrete-time Markov chain with the same
stationary distribution, where I is the identity map and $\epsilon$ is sufficiently
small so that $I + \epsilon q_1$ is nonnegative. In particular, (i) $(P_1 q_1)(A) \le
(P_2 q_1)(A)$ for all $A \in \mathcal{A}$ whenever $P_1 \le P_2$ and (ii) $P_1(I + \epsilon q_1) \le
P_2(I + \epsilon q_1)$ whenever $P_1 \le P_2$, are each necessary and sufficient for
$Y_1(t)$ to have a stochastically monotone transition function. □
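
Uniformization as used in Remark 2 can be illustrated for a single
M/M/s/loss facility, where the order is the usual stochastic order on the
line. The Python sketch below builds the generator, forms the uniformized
matrix I + εq with ε equal to half the reciprocal of the largest exit rate
(a choice made here so that the matrix is nonnegative and the chain is
monotone), and checks numerically that the stationary distribution is
preserved and that the rows are stochastically increasing. All function
names are illustrative.

```python
from math import factorial


def loss_generator(s, load):
    """Generator of the number of busy servers in M/M/s/loss (service rate 1)."""
    q = [[0.0] * (s + 1) for _ in range(s + 1)]
    for k in range(s + 1):
        if k < s:
            q[k][k + 1] = load      # arrival
        if k > 0:
            q[k][k - 1] = float(k)  # departure
        q[k][k] = -sum(q[k])        # diagonal is still zero when summed
    return q


def uniformize(q):
    """Return P = I + eps * q with eps = 1 / (2 * largest exit rate)."""
    eps = 0.5 / max(-q[k][k] for k in range(len(q)))
    return [
        [(1.0 if i == j else 0.0) + eps * q[i][j] for j in range(len(q))]
        for i in range(len(q))
    ]


def erlang_distribution(s, load):
    """Stationary distribution of the number of busy servers."""
    weights = [load ** k / factorial(k) for k in range(s + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def is_stochastically_monotone(p):
    """True if the tail sums of the rows are nondecreasing in the state."""
    size = len(p)
    for m in range(size):
        tails = [sum(row[m:]) for row in p]
        if any(tails[i] > tails[i + 1] + 1e-12 for i in range(size - 1)):
            return False
    return True
```

The factor one half in ε is what guarantees monotonicity for a
birth-and-death chain here, since it keeps the up-probability from one state
plus the down-probability from the next state at or below 1.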

For $A \subseteq S$, let $A^c = S - A$. Let $\bar{\pi}_2$ be the marginal distribution of $\pi_2$
on S, that is, $\bar{\pi}_2(A) = \pi_2(A \times S')$.

Theorem 6: Suppose that the CTMCs $Y_1(t)$ and $(Y_2(t), \Gamma(t))$ defined
above have unique stationary distributions $\pi_1$ on S and $\pi_2$ on $S \times S'$. If
(i) $Y_1(t)$ has a stochastically monotone transition function in $(\mathcal{P}, \le)$
and (ii) for all $A \in \mathcal{A}$ and $\gamma \in S'$, $q_2(s, \gamma; A \times S') \le q_1(s; A)$ for all
$s \in A^c$ and $q_2(s, \gamma; A^c \times S') \ge q_1(s; A^c)$ for all $s \in A$, then $\bar{\pi}_2 \le \pi_1$ in
$(\mathcal{P}, \le)$.
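Proof: Since $\pi_2$ is the unique stationary distribution of $[Y_2(t), \Gamma(t)]$,

$$0 = (\pi_2 q_2)(A) = \sum_{s, \gamma} \pi_2(s, \gamma) q_2(s, \gamma; A)$$

for all $A \subseteq S \times S'$. By condition (ii),

$$0 = (\pi_2 q_2)(A \times S') \le \sum_{s, \gamma} \pi_2(s, \gamma) q_1(s; A) = (\bar{\pi}_2 q_1)(A) \tag{35}$$

for all $A \in \mathcal{A}$. Since the transition function associated with $q_1$ is
stochastically monotone, (35) implies that $\bar{\pi}_2 \le \pi_1$. To see this, let $P_{\epsilon 1}$
be the stochastically monotone transition function $I + \epsilon q_1$ of the
associated discrete-time Markov chain constructed by uniformization.
Then $0 \le (\bar{\pi}_2 q_1)(A)$ for all $A \in \mathcal{A}$ is equivalent to $\bar{\pi}_2 \le \bar{\pi}_2 P_{\epsilon 1}$. Since
$P_{\epsilon 1}$ is stochastically monotone, $\bar{\pi}_2 \le \bar{\pi}_2 P_{\epsilon 1} \le \bar{\pi}_2 P_{\epsilon 1}^2 \le \cdots \le \bar{\pi}_2 P_{\epsilon 1}^n$.
Since $\bar{\pi}_2 P_{\epsilon 1}^n \to \pi_1$ as $n \to \infty$ and ≤ is a closed order,
$\bar{\pi}_2 \le \bar{\pi}_2 P_{\epsilon 1}^n \le \pi_1$. □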

Remark 1: If $Y_2(t)$ is a Markov process, so that we do not need $\Gamma(t)$,
then Theorem 6 follows from Section 4.2 of Stoyan.® In fact, as
explained in Ref. 31, Theorem 6 can also be viewed as a consequence
of both Stoyan® and Massey.*®* □


Remark 2: For both Markov and non-Markov processes, the conditions 
of Theorem 6 also imply stochastic comparisons for the marginal 
distributions at time t for all t.°! □


Remark 3: To relate Theorem 6 here to Theorem 5 of Ref. 30, note
that it suffices to let one of the processes there, say $Y_1(t)$, have
transition rates that do not depend on the extra information; that is,
let $\lambda_1(k, I_1) = \alpha_1(k)$ and $\mu_1(k, I_1) = \beta_1(k)$. (The more general case
follows by just making two comparisons.) Then $Y_1(t)$ becomes a
birth-and-death process on the integers, which is known to be stochastically
monotone with the usual stochastic order for probability measures on
the real line. Theorem 6 here thus yields stochastic order (which is
weaker than the MLR ordering in Ref. 30) under the conditions of the
corollary to Theorem 5 in Ref. 30. Since the stationary distribution of
$Y_1(t)$ depends on $\alpha_1(k)$ and $\beta_1(k + 1)$ only through the ratios
$\alpha_1(k)/\beta_1(k + 1)$, we can generalize the conditions here to the conditions of
Theorem 5 in Ref. 30. In conclusion, then, Theorem 6 here yields a
weaker conclusion (stochastic order instead of MLR order) under the
same conditions as Theorem 5 of Ref. 30, but Theorem 6 here extends
conveniently to the multivariate setting.

We now apply Theorem 6 to our problem. Theorem 1 follows
immediately from (28), which we now establish.

Theorem 7: For each n-tuple $k = (k_1, \ldots, k_n)$,

$$P(Y_i \le k_i,\ 1 \le i \le n) \ge \prod_{i=1}^{n} P[N(s_i, \hat{\alpha}_i) \le k_i].$$
Proof: We apply Theorem 6. The left and right sides of the inequality
will be the stationary distributions of the processes $Y_2(t)$ and $Y_1(t)$,
respectively, representing the number of busy servers at each facility
for $1 \le i \le n$. In both cases, we assume that the service-time
distributions are exponential, which we can do without loss of generality by
Theorem 4. The process $Y_2(t)$ represents the process of interest to us
and the process $Y_1(t)$ is a CTMC in which the coordinate stochastic
processes are independent. In other words, $Y_1(t)$ is the process
corresponding to n independent M/M/s/loss facilities. The information
$\Gamma(t)$ associated with the process $Y_2(t)$ in Theorem 6 here is the number
of class j customers in service for each j at time t. It is easy to see that
the process $\Gamma(t)$ and the bivariate process $[Y_2(t), \Gamma(t)]$ are CTMCs.

To fill in the rest of the details, let the state space S be the product
of n integer intervals and let the state space S' for $\Gamma(t)$ be the product
of c integer intervals, that is,

$$S = \mathop{\times}\limits_{i=1}^{n} \{0, 1, \ldots, s_i\} \quad \text{and} \quad S' = \mathop{\times}\limits_{i=1}^{c} \{0, 1, \ldots, \bar{s}\}, \tag{36}$$

where $\bar{s} = \max\{s_i,\ 1 \le i \le n\}$. Let S be endowed with the usual partial
order in $R^n$; that is, $k_1 \le k_2$ for $k_i = (k_{i1}, \ldots, k_{in})$ if $k_{1j} \le k_{2j}$ for all j.
We shall be interested in lower subsets of S defined by

$$L(k) = \{k' \in S : k' \le k\}. \tag{37}$$

Let $\mathcal{A}$ be the set of complements of lower sets L(k) for $k \in S$; that is,
$\mathcal{A} = \{L(k)^c \equiv S - L(k) : k \in S\}$. The set $\mathcal{A}$ induces a partial-order
relation ≤ on the space $\mathcal{P} \equiv \mathcal{P}(S)$ of all probability measures on S
through the definition

$$P_1 \le P_2 \quad \text{if} \quad P_1(A) \le P_2(A) \quad \text{for all} \quad A \in \mathcal{A}. \tag{38}$$

Here ≤ is a proper partial-order relation because $\mathcal{A}$ is a determining
class.

It remains to show that conditions (i) and (ii) in Theorem 6 hold
with respect to the ordering ≤ in $\mathcal{P}(S)$. To see that condition (i) holds,
that is, that $q_1$ is stochastically monotone with respect to ≤, construct
the associated discrete-time transition function $P_{\epsilon 1} = I + \epsilon q_1$ (see
Remark 2 before Theorem 6) and note that

$$(\pi P_{\epsilon 1})[L(k)] = \sum p_i^{\pm} \pi[L(k \pm e_i)] + \Big(1 - \sum p_i^{\pm}\Big) \pi[L(k)], \tag{39}$$

where $e_i$ is an n-tuple of all 0's except a 1 in one place and $p_i^{\pm}$ is a
probability. (The permissible values of $\pm e_i$ obviously depend on k, but
it is not necessary to specify them or the probabilities $p_i^{\pm}$ in detail.)
From (39), it is immediate that $(\pi_1 P_{\epsilon 1})[L(k)] \le (\pi_2 P_{\epsilon 1})[L(k)]$ for all
$k \in S$ if $\pi_1[L(k)] \le \pi_2[L(k)]$ for all $k \in S$.

To establish condition (ii), involving the comparison of the
intensities, first apply Corollary 4.2 to make all the individual service
rates identical without changing the stationary distributions being
compared, as in the proof of Theorem 5. Next consider transitions upwards
due to arrivals. Observe that for k ∈ L(k')

    q_1[k; L(k')^c] = q_2[k, \gamma; L(k')^c] = 0    (40)

unless k_i = k'_i for some i and

    q_2[k, \gamma; L(k')^c] \le q_1[k; L(k')^c]    (41)

otherwise. Make the comparison (41) by matching the intensities
associated with each class j separately. For q_2 this corresponds to a
simultaneous jump up of one in all coordinates of A_j with intensity λ_j,
while for q_1 this corresponds to a jump up of one in one of the
coordinates of A_j, each with intensity λ_j. Strict inequality occurs in
(41) if the simultaneous transitions are blocked by the upper boundary,
while the corresponding individual transition is not. Inequality also
occurs if k_i = k'_i for two or more indices i. Assuming that k_i = k'_i
for some i and there is no blocking at the upper boundary, the intensity
of transition out of L(k') is λ_j for q_2 but mλ_j for q_1, where m is
the number of indices for which k_i = k'_i.

Next consider transitions downwards due to departures, where now
all individual service rates are identical, say μ. (Invoke Corollary 4.2.)
The transition function q_2 differs from q_1 by having multiple
departures at intensity μ (that depend on the classes present) instead of
individual departures each at intensity μ. The overall intensity of a
transition downward, therefore, can be much greater in q_1, but with q_1
it is possible to enter the sets L(k') from outside, that is, from
k ∈ L(k')^c, only by a departure in at most one of the coordinates. In
other words, we have

    q_2[k, \gamma; L(k')] \ge q_1[k; L(k')]    (42)

for all k ∈ L(k')^c. Strict inequality can occur in (42) if
k_i = k'_i + 1 for two or more i in A_j and k_i ≤ k'_i otherwise when a
class j customer is in service at time t. Then

    q_2[k, \gamma; L(k')] = \mu > 0 = q_1[k; L(k')]    (43)

for k ∈ L(k')^c. Properties (40) through (43) establish condition (ii) of
Theorem 6 in our case. □


III. LARGE SYMMETRIC MODELS


To support the reduced-load approximation in Sections 1.5 and 1.6, 
we investigate large symmetric models. The limit theorems here are 
similar in spirit to previous ones for closed networks of queues with 
unlimited waiting space in Sections V and VIII in Ref. 40. 

Here we assume that all facilities have s servers, all service-time
distributions are exponential, all service rates are 1, all customer class
arrival rates are λ, and all customers require service from m facilities.
We associate one class with each possible subset of size m. We let the
number of facilities n become large with the total offered load per
facility a held fixed. We achieve this by letting the arrival rate per
class when there are n facilities be


anon" 
a in /(”) = sare ae (44) 
It seems useful to focus on the stochastic process Q_{nj}(t) representing
the number of facilities with j busy servers at time t in the model with
n facilities. Obviously, Q_{n0}(t) = n - [Q_{n1}(t) + ... + Q_{ns}(t)], so
that it suffices to focus on j with 1 ≤ j ≤ s. The process
[Q_{n1}(t), ..., Q_{ns}(t)] is convenient because its dimension does not
change as n → ∞. It also appears that this process contains the essential
information to characterize the asymptotic behavior of the blocking
probability. However, this process presents a serious difficulty because,
except in the relatively elementary special case in which s = 1, this
process is not Markov. The future evolution of the process given any
present value depends on additional information, namely, the specific
classes present. However, we show that in a sense this information is
asymptotically irrelevant.


3.1 A conjectured diffusion process limit 


Let V_{nj}(t) be the normalized stochastic process defined by

    V_{nj}(t) = (Q_{nj}(t) - n\beta_j)/\sqrt{n}, \quad t \ge 0,    (45)

and let V_n ≡ V_n(t) be the vector-valued process defined by

    V_n(t) = [V_{n1}(t), \ldots, V_{ns}(t)], \quad t \ge 0.    (46)


In the spirit of many limit theorems for closely related Markov
processes,°!® we conjecture that V_n converges in distribution to a
multivariate diffusion process. It should be possible to establish
weak convergence (convergence in distribution) in the function space
D[0, ∞) of right-continuous functions with left limits,®'**® but we
support the diffusion approximation only by establishing convergence
of the infinitesimal means. For the following conjecture, let V_n(t) be
the stationary version (starting in equilibrium at t = 0) for each n,
which exists and is unique by Theorem 4. The conjectured limit process
is an s-dimensional multivariate Ornstein-Uhlenbeck diffusion process,
which is characterized by its infinitesimal means and
covariances.°”*?© The infinitesimal means and covariances have the
relatively simple form of Mv and Σ, where v is the s-dimensional state
vector and M and Σ are s × s matrices that do not depend on the
state.

Conjecture 5: The sequence of stationary stochastic processes {V_n, n ≥ 1}
defined in (45) and (46) converges weakly (in distribution) in the
function space D([0, ∞), R^s) to a stationary multivariate
Ornstein-Uhlenbeck diffusion process if the normalization constants β_j in
(45) are defined by (10) and (11).

Heuristic Argument: In support of Conjecture 5, we prove that the
infinitesimal means of {V_n} converge as n → ∞ to those of an
s-dimensional Ornstein-Uhlenbeck diffusion process. Even though the
process V_n(t) is not Markov for each n, the infinitesimal means depend
on the past {V_n(u), u < t} only through the present state V_n(t) = v for
each n. For 1 ≤ j ≤ s - 1, the infinitesimal means are

    m_{nj}(v_1, \ldots, v_s)
      = \lim_{h \to 0} E\left[ \frac{V_{nj}(t+h) - V_{nj}(t)}{h} \,\middle|\, V_n(u),\ u \le t,\ V_n(t) = v = (v_1, \ldots, v_s) \right]
      = n^{-1/2} \Bigl\{ (n\beta_{j-1} + \sqrt{n}\,v_{j-1})(ma) \Bigl( \frac{n - n\beta_s - \sqrt{n}\,v_s}{n} \Bigr)^{m-1}
          + m(j+1)(n\beta_{j+1} + \sqrt{n}\,v_{j+1})
          - (n\beta_j + \sqrt{n}\,v_j)(ma) \Bigl( \frac{n - n\beta_s - \sqrt{n}\,v_s}{n} \Bigr)^{m-1}
          - mj(n\beta_j + \sqrt{n}\,v_j) \Bigr\}
      \approx n^{1/2} \{ \beta_{j-1} ma (1 - \beta_s)^{m-1} + m(j+1)\beta_{j+1}
          - \beta_j ma (1 - \beta_s)^{m-1} - mj\beta_j \}
        + \{ v_{j-1} ma (1 - \beta_s)^{m-1} + v_{j+1} m(j+1)
          - v_j ma (1 - \beta_s)^{m-1} - v_j mj \},    (47)

where β_0 = 1 - (β_1 + ... + β_s). For j = s, the infinitesimal mean is

    m_{ns}(v_1, \ldots, v_s)
      = n^{-1/2} \Bigl\{ (n\beta_{s-1} + \sqrt{n}\,v_{s-1})(ma) \Bigl( \frac{n - n\beta_s - \sqrt{n}\,v_s}{n} \Bigr)^{m-1}
          - ms(n\beta_s + \sqrt{n}\,v_s) \Bigr\}
      \approx n^{1/2} \{ \beta_{s-1} ma (1 - \beta_s)^{m-1} - ms\beta_s \}
        + \{ v_{s-1} ma (1 - \beta_s)^{m-1} - v_s ms \}.    (48)

In order for m_{nj}(v_1, ..., v_s) to converge as n → ∞, it is necessary
and sufficient to have the coefficients of n^{1/2} vanish in the first
terms of (47) and (48); that is, we need

    \beta_{j-1} a (1 - \beta_s)^{m-1} + (j+1)\beta_{j+1} = \beta_j a (1 - \beta_s)^{m-1} + j\beta_j, \quad 1 \le j \le s-1,
    \beta_{s-1} a (1 - \beta_s)^{m-1} = s\beta_s.    (49)

By induction, it follows that (10) and (11) provide the unique solution
to (49). The remaining terms in (47) and (48) provide the infinitesimal
means of the limiting diffusion process.
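
The balance conditions (49) can be checked numerically. The sketch below rests on an assumption, since (10) and (11) are not reproduced in this section: it takes β to be the Erlang (truncated Poisson) distribution with reduced load a(1 - β_s)^{m-1}, which is the form the partial balance in (49) forces. The function names and the parameter values s = 3, a = 2, m = 2 are hypothetical.

```python
from math import factorial

def erlang_distribution(s, load):
    """Truncated-Poisson (Erlang) probabilities on {0, ..., s} for a given load."""
    weights = [load ** j / factorial(j) for j in range(s + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def solve_beta(s, a, m):
    """Bisection for beta_s = B(s, a(1 - beta_s)^(m-1)); assumed form of (10)-(11)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        blocking = erlang_distribution(s, a * (1.0 - mid) ** (m - 1))[s]
        if blocking > mid:
            lo = mid      # the fixed point lies above mid
        else:
            hi = mid      # the fixed point lies at or below mid
    return erlang_distribution(s, a * (1.0 - lo) ** (m - 1))

def residuals_49(beta, a, m):
    """Left side minus right side of each equation in (49); beta[j] = beta_j, j = 0..s."""
    s = len(beta) - 1
    r = a * (1.0 - beta[s]) ** (m - 1)
    res = [beta[j - 1] * r + (j + 1) * beta[j + 1] - beta[j] * r - j * beta[j]
           for j in range(1, s)]
    res.append(beta[s - 1] * r - s * beta[s])
    return res

beta = solve_beta(s=3, a=2.0, m=2)
worst = max(abs(v) for v in residuals_49(beta, a=2.0, m=2))
print(beta, worst)
```

Bisection is used instead of direct iteration because the map b -> B(s, a(1 - b)^{m-1}) is decreasing in b, so b minus that map changes sign exactly once on [0, 1].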

A next step to establish Conjecture 5 would be to establish convergence
of the infinitesimal covariances, but the infinitesimal covariances
do depend on more than the current state v for each n, and
seem difficult to calculate. Finally, this would not actually complete
the proof because the process V_n(t) is not Markov. [It almost would
if V_n(t) were Markov by page 268 of Stroock and Varadhan.*]

Conjecture 6 (Corollary to Conjecture 5): The stationary random vector
of V_n(t) is asymptotically normally distributed with zero mean vector
as n → ∞.


3.2 A law of large numbers 


To establish Theorem 3 in Section 1.4, we prove a weaker result
than Conjecture 5, namely, a functional law of large numbers for the
process {[Q_{n1}(t), ..., Q_{ns}(t)], t ≥ 0} as n → ∞. For this purpose,
let

    X_{nj}(t) = n^{-1} Q_{nj}(t), \quad 1 \le j \le s,    (50)

and

    X_n(t) = [X_{n1}(t), \ldots, X_{ns}(t)]    (51)

for t ≥ 0. Note that the components of X_n(t) are always nonnegative
and their sum is at most one, so we can let the state space for X_n(t)
be the s-dimensional simplex, say Δ, which is a compact subset of R^s.

The limiting stochastic process X(t) for X_n(t) will be a continuous
deterministic motion, that is, a Markov diffusion process with zero
diffusion or variance coefficient. The process {X(t), t ≥ 0} has a
transition function

    P[X(t) = T(t, x) \mid X(0) = x] = 1,

where x ∈ Δ and T(t, ·) is a deterministic function mapping the simplex Δ
into itself. Let T_j(t, x) be the jth component of T(t, x), that is,
T(t, x) = [T_1(t, x), ..., T_s(t, x)]. The function T(t, ·) is
characterized by its derivative with respect to t, say
T'(x) = [T'_1(x), ..., T'_s(x)], where T'_j(x) = (d/dt)T_j(t, x), which is
independent of t and is essentially the infinitesimal generator. Let ⇒
denote weak convergence (convergence in distribution) of random elements
in any space, for example, the state space Δ or the space of all sample
paths D([0, ∞), Δ).°%*®°

Theorem 8: Assume exponentially distributed service times with mean
one. If X_n(0) ⇒ X(0) in Δ, then X_n ⇒ X in D([0, ∞), Δ), where X(t) is
a continuous deterministic motion with transition function T(t, x)
having derivatives with respect to t

    T'_j(x) = m[a(1 - x_s)^{m-1}(x_{j-1} - x_j) + (j+1)x_{j+1} - j x_j], \quad 1 \le j \le s-1,
    T'_s(x) = m[a(1 - x_s)^{m-1} x_{s-1} - s x_s],    (52)

where x = (x_1, ..., x_s) and x_0 = 1 - (x_1 + ... + x_s).
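
The deterministic motion of Theorem 8 can be explored numerically by integrating the derivative (52). The following is a minimal sketch using a plain forward-Euler scheme; the step size, the time horizon, and the parameter values s = 3, a = 2, m = 2 are hypothetical choices, and the function names are our own.

```python
def drift(x, a, m):
    """Right side of (52) for x = (x_1, ..., x_s), with x_0 = 1 - sum(x)."""
    s = len(x)
    full = [1.0 - sum(x)] + list(x)          # full[j] = x_j for j = 0, ..., s
    r = a * (1.0 - full[s]) ** (m - 1)       # reduced arrival intensity a(1 - x_s)^(m-1)
    d = [m * (r * (full[j - 1] - full[j]) + (j + 1) * full[j + 1] - j * full[j])
         for j in range(1, s)]
    d.append(m * (r * full[s - 1] - s * full[s]))
    return d

def euler_path(x0, a, m, t_end, step=1e-3):
    """Forward-Euler approximation of T(t_end, x0)."""
    x = list(x0)
    for _ in range(int(round(t_end / step))):
        x = [xi + step * di for xi, di in zip(x, drift(x, a, m))]
    return x

# Start with every facility empty and run until the motion has settled.
x_final = euler_path([0.0, 0.0, 0.0], a=2.0, m=2, t_end=30.0)
print(x_final)
```

At a stationary point of (52) the drift vanishes, which forces the partial balance relations a(1 - x_s)^{m-1} x_{j-1} = j x_j for 1 ≤ j ≤ s. These are easy to check on the printed vector.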

Proof: There are two steps, which we establish in Lemmas 1 and 2
below. First, we show that {X_n} is uniformly tight in D([0, ∞), Δ), so
that every subsequence has a weakly convergent subsequence (see page
35 of Ref. 61). In the process, we show that each limit process has
continuous sample paths. Then we show that the transition functions
P[X_n(t_1 + t_2) ∈ A | X_n(t_1) = x] converge to the transition function
of the specified continuous deterministic motion as n → ∞. Moreover,
we show that the transition probability is asymptotically Markov, that
is, asymptotically independent of the history of the process before t_1.
By Lemma 1, there is a weakly convergent subsequence, and any
weakly convergent subsequence, say {X_{n_k}}, has some limit process X'.
As a consequence of the weak convergence in the function space and
the continuous mapping theorem (Theorem 5.1 of Billingsley, Ref. 61), the
bivariate joint distributions converge weakly in Δ^2 too; that is,

    P\{[X_{n_k}(t_1), X_{n_k}(t_2)] \in \cdot\} \Rightarrow P\{[X'(t_1), X'(t_2)] \in \cdot\}

for all t_1, t_2 ≥ 0. Since X_n(0) ⇒ X(0), X'(0) must be distributed the
same as X(0). Moreover, since the transition functions converge,
the limit X'(t) must be distributed as T[t, X(0)]. Since the sample
paths of X' are continuous, this determines the distribution of X' in
D([0, ∞), Δ). Since the distribution of the limit of every weakly
convergent subsequence of {X_n} in D([0, ∞), Δ) is determined, the
entire sequence thus converges weakly to the determined limit, by
Theorem 2.3 of Billingsley. □


Lemma 1: The sequence {X_n} is uniformly tight in D([0, ∞), Δ) and the
limit of any convergent subsequence has continuous paths.

Proof: To establish uniform tightness in D([0, ∞), Δ), we establish the
stronger C-tightness, conditions for which are given in Theorem 8.3
of Billingsley (Ref. 61). This implies that {X_n} is also D-tight and that
the limit of any convergent subsequence has continuous sample paths. To
establish C-tightness, it suffices to focus on a single coordinate of
{X_n} in D([0, ∞), R), say {X_{nj}} (see Section 2 of Ref. 65 and
Exercise 6, page 41, of Ref. 61). Moreover, it suffices to restrict the
time interval to a compact subinterval.***>*" Since the state space Δ of
X_n is a compact subset of R^s, the set of all probability measures on Δ
with the topology of weak convergence is metrizable as a compact metric
space (see page 45 of Ref. 68). By Prohorov's theorem, page 37 of Ref. 61,
{X_{nj}(0)} is uniformly tight in R and condition (i) of Theorem 8.3 in
Billingsley (Ref. 61) holds.

We establish the remaining condition (ii) by bounding the change
in X_{nj}(t) in a fixed interval of length δ by the normalized sum of all
arrivals and all departures during that interval. The arrivals, in turn,
are bounded by the total number of arrivals that would occur if all
servers remained empty throughout the interval, that is, by a Poisson
random variable with mean nmaδ. Similarly, the number of departures
is bounded above by the number of departures that would occur if all
facilities remained full throughout the interval, that is, by a Poisson
random variable with mean nmsδ. These two bounds can be expressed
via stochastic order relations, as in (23), by actually generating the
arrivals and departures by appropriately thinning two independent
Poisson processes with the indicated rates.

To establish condition (ii), it remains to show that for all positive
c, ε, and η there exists δ such that

    P[N(cn\delta) > n\epsilon] < \delta\eta    (53)

for all n sufficiently large, where N(λ) is a Poisson random variable
with mean λ. Of course, we choose δ so that cδ < ε to have the mean
of N(cnδ) less than nε. Then, using Chebyshev's inequality, we obtain

    P[N(cn\delta) > n\epsilon] \le \frac{\operatorname{Var} N(cn\delta)}{[E N(cn\delta) - n\epsilon]^2}
      = \frac{cn\delta}{(cn\delta - n\epsilon)^2} = \frac{c\delta}{n(c\delta - \epsilon)^2},

which shows that (53) indeed holds for all n > n_0, where
n_0 = c/[η(cδ - ε)^2]. □
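
The Chebyshev step in the proof of Lemma 1 can be sanity-checked numerically. The sketch below compares the exact Poisson tail P[N(cnδ) > nε] with the bound cδ/[n(cδ - ε)^2] and with the target δη. The constants c, δ, ε, η and the value of n are hypothetical, chosen only so that cδ < ε and n exceeds c/[η(cδ - ε)^2]; the function names are our own.

```python
from math import exp, lgamma, log

def poisson_tail(mean, threshold):
    """P[N > threshold] for N ~ Poisson(mean), summing the pmf in log space."""
    k_max = int(threshold)   # N > threshold is the same event as N >= floor(threshold) + 1
    cdf = sum(exp(-mean + k * log(mean) - lgamma(k + 1)) for k in range(k_max + 1))
    return max(0.0, 1.0 - cdf)

def chebyshev_bound(c, delta, eps, n):
    """Var N / (E N - n eps)^2 for N ~ Poisson(c n delta), valid when c delta < eps."""
    return c * delta / (n * (c * delta - eps) ** 2)

# Hypothetical constants with c * delta < eps.
c, delta, eps, eta = 1.0, 0.25, 0.5, 0.1
n0 = c / (eta * (c * delta - eps) ** 2)      # threshold on n from the proof
n = 200                                      # any n > n0 should work

exact = poisson_tail(c * n * delta, n * eps)
bound = chebyshev_bound(c, delta, eps, n)
print(n0, exact, bound, delta * eta)
```

The exact tail is typically far smaller than the Chebyshev bound; the proof only needs the bound to fall below δη for large n, which is a much weaker requirement.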
Let A^ε be the open ε-ball in the simplex Δ about the set A, that is,

    A^{\epsilon} = \{x \in \Delta : d(x, y) < \epsilon \ \text{for some } y \in A\},    (54)

where d is a metric on R^s, here taken to be the maximum metric
d(x, y) = max{|x_i - y_i|, 1 ≤ i ≤ s}.
Lemma 2: For all positive ε, states x ∈ Δ and histories {X_n(u), u ≤ t_1},

    \lim_{n \to \infty} P\bigl(X_n(t_1 + t_2) \in [T(t_2, x)]^{\epsilon} \mid X_n(u),\ u \le t_1,\ X_n(t_1) = x\bigr) = 1,

where T is the continuous deterministic motion in Theorem 8.

Proof: Let I be the identity map on Δ. Since T(t, ·) has the semigroup
property of a Markov process and the derivative T' is bounded and
continuous, (I + εT')^{t/ε} → T(t, ·) as ε → 0. Consequently, it suffices
to prove that there is a constant K such that for all sufficiently small
positive ε

    \lim_{n \to \infty} P\bigl(X_n(t + \epsilon) \in [x + \epsilon T'(x)]^{K\epsilon^2} \mid X_n(u),\ u \le t,\ X_n(t) = x\bigr) = 1.    (55)


To establish (55), we use stochastic dominance arguments as in the
proof of Lemma 1. In particular, we first observe that, for any n, t and
ε, the total number of arrivals in the interval [t, t + ε] is
stochastically dominated by a Poisson variable with mean nmaε. Similarly,
for any n, t, and ε, the total number of departures in the interval
[t, t + ε] is stochastically dominated by a Poisson variable with mean
nmsε. These stochastic bounds give us initial bounds on how much X_n(u)
can differ from X_n(t) in the interval [t, t + ε] for all possible
histories. Since X_{nj}(t) is a proportion, we can apply a law of large
numbers for Poisson variables as the rate increases. In particular, there
is a constant K, which is independent of ε as ε → 0, such that

    \lim_{n \to \infty} P\Bigl[ \sup_{t \le u \le t+\epsilon} |X_{nj}(u) - X_{nj}(t)| > K\epsilon \,\Bigm|\, X_n(u),\ u \le t,\ X_n(t) = x \Bigr] = 0.    (56)


We now use the initial bound in (56) to produce better bounds on
X_n(t + ε) - X_n(t), that is, to establish (55). Given that X_n(t) = x and

    \sup_{t \le u \le t+\epsilon} |X_{nj}(u) - X_{nj}(t)| < K\epsilon

for 1 ≤ j ≤ s, the actual flow rate into state j [the rate of increase of
X_{nj}(u)] in the interval [t, t + ε] is bounded above by

    I^u(j) = a \min\{1, (1 - x_s + K\epsilon)^{m-1}\}(x_{j-1} + K\epsilon) + (j+1)(x_{j+1} + K\epsilon)
           \le a \min\{1, (1 - x_s + K\epsilon)^{m-1}\} x_{j-1} + aK\epsilon + (j+1)x_{j+1} + (j+1)K\epsilon
           \le a(1 - x_s)^{m-1} x_{j-1} + (j+1)x_{j+1} + [am + (j+1)]K\epsilon    (57)

and bounded below by

    I^l(j) = a \max\{0, (1 - x_s - K\epsilon)^{m-1}\}(x_{j-1} - K\epsilon) + (j+1)(x_{j+1} - K\epsilon)
           \ge a(1 - x_s)^{m-1} x_{j-1} + (j+1)x_{j+1} - [am + (j+1)]K\epsilon.    (58)

In other words, with n facilities the flow into state j for the
unnormalized process Q_{nj}(t) is stochastically bounded above by a
Poisson process with rate nI^u(j) and stochastically bounded below by a
Poisson process with rate nI^l(j).® Similarly, the flow rate out of state
j [the rate of decrease of X_{nj}(u)] in the interval [t, t + ε] is
bounded above by

    O^u(j) = a \min\{1, (1 - x_s + K\epsilon)^{m-1}\}(x_j + K\epsilon) + j(x_j + K\epsilon)
           \le a(1 - x_s)^{m-1} x_j + j x_j + (am + j)K\epsilon    (59)

and bounded below by

    O^l(j) = a \max\{0, (1 - x_s - K\epsilon)^{m-1}\}(x_j - K\epsilon) + j(x_j - K\epsilon)
           \ge a(1 - x_s)^{m-1} x_j + j x_j - (am + j)K\epsilon.    (60)


We invoke a well-known functional law of large numbers for the
Poisson process (which is a consequence of the functional central limit
theorem, Section 17 of Billingsley, Ref. 61) to deduce that as n → ∞ the
change in X_n(u), that is, the change in the proportions, is bounded
above and below by the deterministic motions with rates I^u(j) - O^l(j)
and I^l(j) - O^u(j), respectively. Hence, for any history {X_n(u), u < t}
and any state X_n(t) = x,

    \lim_{n \to \infty} P\{ \epsilon[I^l(j) - O^u(j)] \le X_{nj}(t + \epsilon) - X_{nj}(t)
        \le \epsilon[I^u(j) - O^l(j)] \mid X_n(u),\ u \le t,\ X_n(t) = x \} = 1,    (61)

but

    \epsilon[I^u(j) - O^l(j)] \le \epsilon T'_j(x) + \epsilon^2 K'

and

    \epsilon[I^l(j) - O^u(j)] \ge \epsilon T'_j(x) - \epsilon^2 K'

for K' = (2am + 2j + 1)K, so that (61) is equivalent to the desired
result. □
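
The proof of Lemma 2 relies on the iterated one-step map (I + εT')^{t/ε} approaching T(t, ·) as ε → 0. The following is a rough numerical illustration of that convergence, not a proof: it iterates x -> x + εT'(x) with T' taken from (52) and compares two step sizes against a much finer reference run. The parameters s = 2, a = 1, m = 2, the horizon t = 1, and the step sizes are hypothetical.

```python
def drift(x, a, m):
    """Right side of (52) for x = (x_1, ..., x_s), with x_0 = 1 - sum(x)."""
    s = len(x)
    full = [1.0 - sum(x)] + list(x)
    r = a * (1.0 - full[s]) ** (m - 1)
    d = [m * (r * (full[j - 1] - full[j]) + (j + 1) * full[j + 1] - j * full[j])
         for j in range(1, s)]
    d.append(m * (r * full[s - 1] - s * full[s]))
    return d

def euler_iterate(x, a, m, t, eps):
    """Apply the map x -> x + eps * T'(x) a total of round(t / eps) times."""
    for _ in range(int(round(t / eps))):
        x = [xi + eps * di for xi, di in zip(x, drift(x, a, m))]
    return x

def sup_distance(x, y):
    """Maximum metric d(x, y) used in the text."""
    return max(abs(p - q) for p, q in zip(x, y))

x0 = [0.0, 0.0]                                   # all facilities empty
reference = euler_iterate(x0, 1.0, 2, 1.0, 1e-4)  # fine-step stand-in for T(1, x0)
coarse_error = sup_distance(euler_iterate(x0, 1.0, 2, 1.0, 0.02), reference)
fine_error = sup_distance(euler_iterate(x0, 1.0, 2, 1.0, 0.002), reference)
print(coarse_error, fine_error)
```

The reference run is itself only an approximation of T(1, x0), so the printed errors indicate the trend as ε shrinks rather than exact distances to the true motion.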

We now describe the limiting continuous deterministic motion X(t)
specified in Theorem 8. In particular, we verify that X(t) has a unique
stationary distribution and converges to it as t → ∞ for any initial
distribution. It is relatively elementary that T(t, ·) has a unique fixed
point in the simplex Δ. We want to establish the stronger result that
T(t, ·) has a unique fixed point in the space 𝒫(Δ) of all probability
measures on Δ. To appreciate the difference, note that clockwise circular
motion at constant angular velocity in the plane has a unique fixed point
in the plane, namely, the origin, but the uniform distribution over any
circle centered about the origin is a stationary distribution for this
clockwise circular motion. We show that our continuous deterministic
motion actually converges to its unique fixed point in Δ for every
initial distribution.

Theorem 9: For any initial vector y, X(t) → β as t → ∞, where β =
(β_1, ..., β_s) is determined by (10) and (11).

Corollary 9.1: The limiting continuous deterministic motion X(t) has a
unique stationary distribution, which is a unit mass on the vector β
determined by (10) and (11).
Proof: We write T_t(A_1) → A_2 as t → ∞ for subsets A_1 and A_2 of the
simplex Δ to represent that T(t, y) → A_2 as t → ∞ for all y ∈ A_1, that
is, d(T(t, y), A_2) → 0 as t → ∞, where d(x, A) = inf{d(x, y): y ∈ A} with
d the metric on R^s. Equivalently, T(t, y) → A_2 as t → ∞ if the limits of
all convergent subsequences {T(t_k, y), k = 1, 2, ...} of
{T(t, y), t ≥ 0} with t_k → ∞ are contained in A_2. (Since Δ is a compact
metric space, every sequence has a convergent subsequence. Moreover, the
limit sets A_2 considered below will be closed, so that they will contain
the limits.) Our goal is to show that T_t(Δ) → {β}. To do so, we construct
compact subsets L_1, ..., L_s such that

    L_s = \{\beta\} \subseteq L_{s-1} \subseteq \cdots \subseteq L_1 \subseteq \Delta,    (62)

T_t(Δ) → L_1 and T_t(L_k) → L_{k+1}, 1 ≤ k ≤ s - 1, as t → ∞. Since T_t
has the semigroup property T(t_1 + t_2, x) = T[t_1, T(t_2, x)] for all x,
t_1 and t_2, and is continuous, this implies that T_t(Δ) → L_k for all k,
so that T_t(Δ) → {β}.

We consider real-valued functionals of T(t, ·). First we consider the
net flow into the set {1, ..., s}, defined by

    F_{st}(x) = \sum_{j=1}^{s} T_j(t, x)

with derivative

    F'_s(x) = a(1 - x_s)^{m} - \sum_{j=1}^{s} j x_j,    (63)

which is continuous and strictly decreasing in x. Moreover, for
all x sufficiently large, F'_s(x) < 0; and for all x sufficiently small,
F'_s(x) > 0. Consequently, F_{st}(x) → 0, F'_s[T(t, x)] → 0 and
T_t(Δ) → L_1 as t → ∞, where

    L_1 = \{x \in \Delta : F'_s(x) = 0\}.    (64)


Next consider the net flow into the states {1, ..., s - 1}, defined
by

    F_{(s-1)t}(x) = \sum_{j=1}^{s-1} T_j(t, x)

with derivative

    F'_{s-1}(x) = a(1 - x_s)^{m-1}(1 - x_s - x_{s-1}) - \sum_{j=1}^{s-1} j x_j
                = \Bigl[ a(1 - x_s)^{m} - \sum_{j=1}^{s} j x_j \Bigr] + [s x_s - x_{s-1} a(1 - x_s)^{m-1}]
                = F'_s(x) + [s x_s - x_{s-1} a(1 - x_s)^{m-1}].    (65)

For x ∈ L_1, F'_s(x) = 0, and F'_{s-1}(x) = [s x_s - x_{s-1} a(1 - x_s)^{m-1}],
which is continuous and strictly decreasing in (x_{s-1}, -x_s). For all
x ∈ L_1 with x_{s-1} sufficiently large (small) and x_s sufficiently small
(large), F'_{s-1}(x) < 0 (> 0). Hence, F_{(s-1)t}(x) → 0 and
F'_{s-1}[T(t, x)] → 0 for x ∈ L_1, and T_t(L_1) → L_2 as t → ∞, where

    L_2 = \{x \in L_1 : F'_{s-1}(x) = 0\}.    (66)
Similarly, we consider the net flow F_{(s-2)t}(x) into the states
{1, ..., s - 2} with derivative

    F'_{s-2}(x) = F'_s(x) + F'_{s-1}(x) + (s-1)x_{s-1} - x_{s-2} a(1 - x_s)^{m-1},    (67)

which is continuous and strictly decreasing in (x_{s-2}, x_{s-1}, -x_s).
Moreover, for all x ∈ L_2 with x_{s-2} sufficiently large (small) and
x_{s-1} and x_s sufficiently small (large), F'_{s-2}(x) < 0 (> 0). Hence,
T_t(L_2) → L_3, where

    L_3 = \{x \in L_2 : F'_{s-2}(x) = 0\}.    (68)

The proof is completed by induction. The s equations F'_k(x) = 0,
1 ≤ k ≤ s, uniquely determine the fixed point β of T(t, ·) in the simplex
Δ defined by (10) and (11). These are the partial balance equations for a
single M/M/s/loss facility.® Hence, L_s = {β} and T_t(Δ) → {β} as
t → ∞. □


3.3 Proof of Theorem 3(a) 


Proof: We now apply Theorems 8 and 9 to prove Theorem 3(a). Let
Z_n = (Z_{n1}, ..., Z_{ns}) have the unique stationary distribution of
{X_n(t), t ≥ 0} for each n. (Existence and uniqueness follow from
Theorem 4.) Since the state space for {Z_n} is the compact simplex Δ in
R^s, the sequence {Z_n} is uniformly tight and has a weakly convergent
subsequence, say {Z_{n_k}}; apply the argument in the proof of Theorem 8.
Since Z_{n_k} ⇒ Z in Δ as k → ∞ for some Z, the stationary versions of the
stochastic processes X_{n_k}(t) converge weakly (in distribution) in
D([0, ∞), Δ) as n_k → ∞ to the continuous deterministic motion X(t)
with X(0) distributed as Z (applying Theorem 8). However, since
X_{n_k}(t) is stationary for each n_k, so is X(t). By Corollary 9.1, the
only stationary distribution for X(t) is the unit mass on the limiting
vector β. Hence, we must have P(Z = β) = 1. Since every convergent
subsequence of {Z_n} has the same limit Z, we must have convergence of
the entire sequence, that is, Z_n ⇒ Z in Δ (see Theorem 2.3 of Ref. 61).
Since P(Z = β) = 1 for the deterministic vector β, we have convergence in
probability (see page 25 of Ref. 61). □

Remark: Theorems 3, 8, and 9 together imply that the stationary
versions of the stochastic processes X_n(t) also satisfy a functional law
of large numbers in D([0, ∞), Δ).


3.4 Proof of Theorem 3(b) 


The key to Theorem 3(b), of course, is Theorem 3(a) and the
symmetry: Every subset of size m is equally likely to be the set of m
required facilities for each arrival. In addition to Theorem 3(b) we
establish a stronger form of asymptotic independence, for the stochastic
processes instead of only the stationary distributions. Let Y_{ni}(t) be
the number of busy servers at facility i at time t. Let
Y_n(t) = [Y_{n1}(t), ..., Y_{nn}(t)] be the stationary version for each n.

Theorem 10: For any finite subset H and any t_0, the stationary stochastic
processes {Y_{ni}(t), 0 ≤ t ≤ t_0}, i ∈ H, are asymptotically independent
as n → ∞.

Proof: By symmetry, the joint distribution of {Y_{ni}(t), i ∈ H} is
invariant under a permutation of the indices. By Theorem 3(a), the
proportion of facilities with j busy servers converges in probability to
β_j as n → ∞. Hence, by symmetry,

    \lim_{n \to \infty} P[Y_{ni}(0) = j_i,\ i \in H] = \prod_{i \in H} \beta_{j_i},

so that the initial stationary values Y_{ni}(0), i ∈ H, are asymptotically
mutually independent. Next, let A_{ni}(t) be the arrival process to
facility i excluding losses due to blocking elsewhere. By Theorem 3(a),
A_{ni}(t) converges to a Poisson process with rate a(1 - β_s)^{m-1} as
n → ∞. Moreover, again by symmetry and Theorem 3(a), the arrival processes
{A_{ni}(t), 0 ≤ t ≤ t_0}, i ∈ H, are asymptotically mutually independent
as n → ∞. Since the probability that the facilities in H share any
customers at any time in the interval [0, t_0] is asymptotically
negligible as n → ∞, the departure processes for i ∈ H and thus also the
processes {Y_{ni}(t), 0 ≤ t ≤ t_0}, i ∈ H, are asymptotically mutually
independent. □


IV. EXISTENCE, UNIQUENESS, AND INSENSITIVITY 


We now prove Theorem 4. 


Proof: In the case of exponentially distributed service times, the
vector-valued stochastic process, say [N_1(t), ..., N_c(t)], representing
the number of class j customers in service at time t for all j,
1 ≤ j ≤ c, is
an irreducible c-dimensional continuous-time Markov chain with a 
finite state space. Hence, there exists a unique stationary distribution. 
It is easy to see that the claimed distribution in Theorem 4 is the 
steady-state distribution by making the standard partial balance anal- 
ysis.°*® The same steady-state distribution holds for general service- 
time distributions by the insensitivity results, which we discuss further 
below. 

To prove the rest of Theorem 4, we need to establish that the
steady-state distribution of (N_1, ..., N_c) is actually well defined.
For this
purpose, we construct a continuous-time vector-valued Markov proc- 
ess {Z(t), t = 0}, depicting the number of class j customers in service 
for each j and the remaining service time of each at time ¢. [Z(t) is 
the continuous-time Markov process associated with the GSMP in 
Ref. 46.] The steady-state distribution in Theorem 4 is understood to 
be the marginal distribution corresponding to (N_1, ..., N_c) of the
stationary distribution of Z(t). We shall show that Z(t) indeed has a 
stationary distribution (without establishing uniqueness) and that the 
marginal distribution corresponding to (N_1, ..., N_c) is always as
claimed in Theorem 4 and so is unique. 

For our given general service-time distributions, we construct
sequences of approximating service-time distributions from finite
mixtures of finite convolutions of exponential distributions, as in Section
3.3 of Ref. 6 and in the proof of Theorem 2 in Ref. 46. We construct 
this so that the means are unchanged and there is weak convergence 
to the given distributions. For our special model, it is easy to see that 
each continuous-time Markov process Z(t) so created with these 
approximating service-time distributions has a unique invariant prob- 
ability measure. Existence follows from the theory of continuous-time 
Markov chains with finite-state space. Uniqueness follows from the 
irreducibility that is evident from our special structure. Moreover, the 
partial balance property satisfied by the steady-state distribution in 
the exponential case implies that the unique stationary distribution 
of Z(t) in each approximating case has marginal distribution for 
(N_1, ..., N_c) as specified in Theorem 4.°*** Finally, we treat the case
of the original general service-time distributions by continuity, invok- 
ing Theorem 3 of Ref. 46. (Note that uniqueness with the approxi- 
mating service-time distributions is crucial for that theorem.) This 
continuity theorem implies that the process Z(t) indeed has a station- 
ary distribution and that the marginal distribution corresponding to 
(N_1, ..., N_c) is as claimed for every stationary distribution of Z(t). □

Remark: An alternate proof of existence and uniqueness can be con- 
structed using the fact that arrival epochs when the system is empty 
constitute regeneration points. The GSMP theory is also useful for 
describing steady state in more general models for which this is not 
the case; for example, if the service-time distributions are nonexpo- 
nential and the arrival processes for the different classes are inde- 
pendent non-Poisson renewal processes. However, the insensitivity is 
typically lost with this extension. 


V. CONVERGENCE OF THE SUCCESSIVE APPROXIMATION 

ALGORITHM 

Example 3 in Section 1.5 showed that the successive approximation
scheme (8) need not converge. In this section we show that if the
offered loads are sufficiently small, then the operator T defined by the
right side of (7) is a contraction operator, so that it has a unique fixed
point to which successive iterates of T converge geometrically fast.
However, the conditions for this property are quite strong, so that the
theorem is far from covering all practical cases.
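
For intuition, the successive approximation scheme can be sketched for the symmetric model. This is a sketch under stated assumptions, since (7) and (8) are not reproduced in this section: it assumes the symmetric reduced-load map b -> B(s, a(1 - b)^{m-1}), with B the Erlang loss formula computed by its standard recursion. The function names and the light-load parameters s = 3, a = 0.5, m = 2 are hypothetical.

```python
def erlang_b(s, a):
    """Erlang loss probability B(s, a) via B(k) = a B(k-1) / (k + a B(k-1)), B(0) = 1."""
    b = 1.0
    for k in range(1, s + 1):
        b = a * b / (k + a * b)
    return b

def successive_approximation(s, a, m, b0=1.0, tol=1e-12, max_iter=10_000):
    """Iterate b <- B(s, a(1 - b)^(m-1)); returns (b, iterations, converged)."""
    b = b0
    for k in range(1, max_iter + 1):
        b_next = erlang_b(s, a * (1.0 - b) ** (m - 1))
        if abs(b_next - b) < tol:
            return b_next, k, True
        b = b_next
    return b, max_iter, False

# Light offered load: the iterates settle quickly.
b_star, iterations, converged = successive_approximation(s=3, a=0.5, m=2)
print(b_star, iterations, converged)
```

Starting from b0 = 1, the first iterate is 0 and the second is B(s, a), so the iterates alternate around the fixed point. The `converged` flag is returned explicitly because, as Example 3 indicates, convergence is not guaranteed at heavier loads.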

To state our results, let ||·|| be the supremum norm on R^n defined
by ||x|| = max{|x_i|: 1 ≤ i ≤ n} for x = (x_1, ..., x_n). Let â_i(b) be
the reduced offered load as a function of b = (b_1, ..., b_n) as defined
in (6). Let γ(b) be defined by


+ (b) = mae (a5 —-1+ , bnésh : (69) 
Let U = (U_1, ..., U_n) be an upper bound on any solution b* of (7)
such as (B(s_1, â_1), ..., B(s_n, â_n)) = T^2(1) or T^{2k}(1) for any
k ≥ 1.

Theorem 11: If γ(U) < 1 for γ in (69) and the upper bound U to any
solution of (7), then
(i) ||T(b^1) - T(b^2)|| ≤ γ(U) ||b^1 - b^2|| for all b^1 and b^2 in R^n
with 0 ≤ b^1_i, b^2_i ≤ U_i for all i, so that
(ii) T has a unique fixed point b* in [0, U] = {b: 0 ≤ b_i ≤ U_i}, and
(iii) ||T^k(b^0) - b*|| ≤ γ(U)^k ||b^0 - b*|| for all k when the initial
vector T^0(b) = b^0 is in [0, U].
Proof: Parts (ii) and (iii) follow from (i) by the Banach-Picard
fixed-point theorem for a contraction map on a complete metric space
(Ref. 70). For (i) it suffices to have

    \left| \frac{\partial T_i(b)}{\partial b_k} \right| \le \frac{\gamma(U)}{n}

for all i and k (for example, see Theorem 2, page 111 of Ref. 70). By
Theorem 15 of Jagerman (Ref. 71),

    \frac{\partial B(s, a)}{\partial a} = \left[ \frac{s}{a} - 1 + B(s, a) \right] B(s, a).

Hence,

OT;(b) Si . — yU) 

| ab, < E (b) 1+ | bia; < — 
for b € [0, U]. 
Remark 1: For the symmetric model, (69) simplifies to 

nsU , 
y(U) = G@-u™=" (1 — U)Una, (70) 


so that a simple sufficient condition for the condition of theorem 11 
is 
nsU 2 
(1 = U) m-1 


Remark 2: If U_i = B(s_i, â_i) for all i or if U = T^{2k}(1) for some k
using (8), then U is an increasing function of the offered loads
(â_1, ..., â_n) or (α_1, ..., α_c) because T is an increasing function of
b and B(s, a) is an increasing function of a. Hence, if the offered loads
are sufficiently small, then the vector U will be sufficiently small, so
that the condition of Theorem 11 will eventually hold.


1. (71) 


VI. CONCLUSIONS 
We have investigated a model to describe the blocking probabilities 
when service is required from several multiserver facilities
simultaneously. We have shown in Theorem 1 that some standard approxi-
mations produce upper bounds. In the process, we have established 
several other useful stochastic comparison results (Theorems 5 
through 7 and Ref. 31). We also have proposed an improved reduced- 
load approximation and developed an efficient algorithm (Theorem 2) 
to treat both the Poisson arrival case (Section 1.5) and the non- 
Poisson arrival case (Section 1.9). In Theorem 8 we have established 
a functional law of large numbers that implies that the symmetric 
reduced-load approximation is asymptotically correct for symmetric 
models as the number of facilities increases with the offered load per 
facility and the number of facilities per class held fixed (Theorems 3 
and 10). We have displayed the exact formula in Theorem 4 and 
justified the insensitivity with respect to the service-time distributions 
(Sections 1.8 and IV). 

Among the important directions for future research are (i) testing 
the approximations further, especially for non-Poisson arrival proc- 
esses; (ii) establishing better conditions for the reduced-load equations 
(7) to have a unique solution (Conjectures 1 and 2); (iii) establishing 
better conditions for the successive approximation scheme (8) to 
converge; (iv) establishing lower bounds on the exact blocking proba- 
bilities paralleling the upper bounds in Theorem 1; (v) determining if 
the reduced-load approximation is an upper bound on the exact 
blocking probability for symmetric models (Conjecture 3); (vi) deter- 
mining if the exact blocking probabilities for symmetric models are 
increasing in n when the offered load per facility is fixed (Conjecture 
4); (vii) establishing (if possible) the diffusion limit in Section III 
(Conjectures 5 and 6); (viii) seriously analyzing smaller models in 
which the basic facility-independence approximation in (5) underlying 
all the approximations here is not appropriate.’ In particular, in the
spirit of Kaufman (Ref. 7) and Mitra and Weinberger (Ref. 21), it would be nice to
develop an efficient algorithm for the exact blocking probabilities in 
Theorem 4 and Corollary 4.1. 

It would also be of interest to consider other related models, for 
example, models in which more than one server per facility may be 
required, and related delay systems. There are two kinds of waiting to 
be considered for delay systems: waiting for each customer class 
outside the system, and waiting for service at each facility within the 
system. The second form of waiting may still require simultaneous 
service or some other form.“ 
VII. EPILOGUE


This section has been added in proof to report important new work. 
Kelly (Ref. 71) has proved that the reduced-load system of eq. (7) has a unique
solution, thus confirming Conjectures 1 and 2. Kelly also has proved
that the reduced-load approximation is asymptotically correct in heavy
traffic, that is, in a network with fixed topology in which $\alpha_i \to \infty$ and
$s_i \to \infty$, as in Ref. 57. In fact, Kelly’s heavy-traffic limit theorem is a
multifacility generalization of the local limit theorem in the Appendix 
of Ref. 57. 

Ziedens and Kelly (Ref. 72) also have proved limit theorems similar to
Theorem 3 for symmetric networks in which the number of nodes
increases. For the special tree networks in Fig. 1, Mitra (Ref. 73) has
determined an efficient algorithm for the exact solution based on asymptotic
expansions, in the spirit of Ref. 21. Other related work appears in 
Refs. 74 through 76. 


REFERENCES 


1. D. Bear, Principles of Telecommunication-Traffic Engineering, London: The Insti- 
tution of Electrical Engineers, 1976. 

2. D. D. Sheng, “Performance Analysis Methodology for Packet Network Design,” 
IEEE Global Telecommun. Conf., GLOBECOM ’83 (December 1983), pp. 456-60. 

3. C. L. Monma and D. D. Sheng, unpublished work. 

4. E. Çinlar, Introduction to Stochastic Processes, Englewood Cliffs, New Jersey:
Prentice-Hall, 1975. 

5. R. W. Wolff, “Poisson Arrivals See Time Averages,” Oper. Res., 30, No. 2 (March- 
April 1982), pp. 223-31. 

6. F. P. Kelly, Reversibility in Stochastic Networks, New York: Wiley, 1979. 

7. J. S. Kaufman, “Blocking in a Shared Resource Environment,” IEEE Trans. 
Commun., COM-29, No. 10 (October 1981), pp. 1474-81. 

8. D. Y. Burman, J. P. Lehoczky, and Y. Lim, “Insensitivity of Blocking Probabilities
in a Circuit-Switching Network,” J. Appl. Probab., 21, No. 4 (December 1984),
pp. 850-59.

9. J. M. Holtzman, “Analysis of Dependence Effects in Telephone Trunking Net- 
works,” B.S.T.J., 50, No. 8 (October 1971), pp. 2647-62. 

10. V. E. Benes, “Models and Problems of Dynamic Memory Allocation,” Applied 
Probability—Computer Science: The Interface, Vol. I, ed. R. L. Disney and T. J. 
Ott, Boston: Birkhauser, 1982, pp. 89-135. 

11. E. G. Coffman, Jr., T. T. Kadota, and L. A. Shepp, “A Stochastic Model of 
Fragmentation in Dynamic Storage Allocation,” SIAM J. Comput., 14, No. 2 
(May 1985), pp. 416-25. 

12. G. F. Newell, “The M/M/∞ Service System With Ranked Servers in Heavy Traffic,”
Lecture Notes in Economics and Math. Systems, 231, New York: Springer-Verlag,
1984.

13. L. A. Gimpelson, “Analysis of Mixtures of Wide and Narrow Band Traffic,” IEEE 
Trans. Commun. Technol., 13 (1965), pp. 258-66. 

14. E. Wolman, “The Camp-On Problem for Multiple-Address Traffic,” B.S.T.J., 51, 
No. 6 (July-August 1972), pp. 1363-422. 

15. K. J. Omahen, “Capacity Bounds for Multiresource Queues,” J. Assoc. Comput. 
Mach., 24, No. 4 (October 1977), pp. 646-63. 

16. E. Arthurs and J. S. Kaufman, “Sizing a Message Store Subject to Blocking 
Criteria,” Performance of Computer Systems, ed. M. Arato, A. Butrimenko, and 
E. Gelenbe, Amsterdam: North-Holland, 1979, pp. 547-64. 

17. L. Green, “A Queueing System in Which Customers Require a Random Number of 
Servers,” Oper. Res., 28, No. 6 (November-December 1980), pp. 1835-46. 

18. P. H. Brill and L. Green, “Queues in Which Customers Receive Simultaneous 
Service From a Random Number of Servers: A System Point Approach,” Manage. 
Sci., 30, No. 1 (January 1984), pp. 51-68. 

19. L. Green, “A Multiple Dispatch Queueing Model of Police Patrol Operations,” 
Manage. Sci., 30, No. 6 (June 1984), pp. 653-64. 

20. A. Federgruen and L. Green, “An M/G/c Queue in Which the Number of Servers 
Required is Random,” J. Appl. Probab., 21, No. 3 (September 1984), pp. 583-601.

21. D. Mitra and P. J. Weinberger, “Probabilistic Models of Database Locking: Solu- 
tions, Computational Algorithms, and Asymptotics,” J. Assoc. Comput. Mach., 
31, No. 4 (October 1984), pp. 855-78. 

22. D. Mitra, unpublished work. 

23. D. P. Heyman, “Asymptotic Marginal Independence in Large Networks of Loss 
Systems,” Bell Communications Research, Holmdel, 1985. Presented at the 
ORSA/TIMS Applied Probability Conf., Williamsburg, Va., January 1985. 

24. W. J. Hery, private communication. 

25. J. T. Wittbold, AT&T Communications, private communication. 

26. J. M. Akinpelu, “The Overload Performance of Engineered Networks With
Nonhierarchical and Hierarchical Routing,” AT&T Bell Lab. Tech. J., 63, No. 7
(September 1984), pp. 1261-82. 

27. D. L. Jagerman, “Some Properties of the Erlang Loss Functions,” B.S.T.J., 53, No. 
3 (March 1974), pp. 525-51. 

28. D. L. Jagerman, “Methods in Traffic Calculations,” AT&T Bell Lab. Tech. J., 63, 
No. 7 (September 1984), pp. 1283-310. 

29. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. I, Third 
Edition, New York: Wiley, 1968. 

30. D. R. Smith and W. Whitt, “Resource Sharing for Efficiency in Traffic Systems,” 
B.S.T.J., 60, No. 1 (January 1981), pp. 39-55. 

31. W. Whitt, unpublished work. 

32. D. J. Daley, “Stochastically Monotone Markov Chains,” Zeitschrift Wahrschein- 
lichkeitstheorie Verw. Geb., 10 (1968), pp. 305-17. 

33. J. Keilson and A. Kester, “Monotone Matrices and Monotone Markov Processes,” 
Stoch. Proc. Appl., 5, No. 3 (July 1977), pp. 231-41. 

34. A. Kester, Preservation of Cone Characterizing Properties of Markov Chains, Ph.D. 
Thesis, University of Rochester, 1977. 

35. D. Stoyan, Comparison Methods for Queues and Other Stochastic Models, ed. D. J. 
Daley, New York: Wiley, 1983. 

36. W. A. Massey, “An Operator Analytic Approach to the Jackson Network,” J. Appl. 
Probab., 21, No. 2 (June 1984), pp. 379-93.

37. W. A. Massey, “Open Networks of Queues: Their Algebraic Structure and
Estimating Their Transient Behavior,” Adv. Appl. Probab., 16, No. 1 (March 1984), pp.
176-201.

38. W. A. Massey, unpublished work. 



39. N. Dunford and J. T. Schwartz, Linear Operators, Part I: General Theory, New 
York: Interscience, 1958. 

40. W. Whitt, “Open and Closed Models for Networks of Queues,” AT&T Bell Lab. 
Tech. J., 63, No. 9 (November 1984), pp. 1911-79. 

41. E. Fuchs and P. E. Jackson, “Estimates of Distributions of Random Variables for
Certain Computer Communications Traffic Models,” Commun. ACM, 13, No. 12 (December 1970),
pp. 752-7.

42. P. F. Pawlita, “Traffic Measurements in Data Networks, Recent Measurement 
Results, and Some Implications,” IEEE Trans. Commun., COM-29, No. 4 (April 
1981), pp. 525-35. 

43. W. T. Marshall and S. P. Morgan, unpublished work.

44. R. Schassberger, “Insensitivity of Steady-State Distributions of Generalized
Semi-Markov Processes With Speeds,” Adv. Appl. Probab., 10, No. 4 (December 1978),
pp. 836-51. 

45. D. Y. Burman, “Insensitivity in Queueing Systems,” Adv. Appl. Probab., 13, No. 4 
(December 1981), pp. 846-59. 

46. W. Whitt, “Continuity of Generalized Semi-Markov Processes,” Math. Oper. Res., 
5, No. 4 (November 1980), pp. 494-501. 

47. E. Brockmeyer, H. L. Halstrom, and A. Jensen (eds.), The Life and Works of A. K. 
Erlang, Copenhagen: Danish Academy of Sciences, 1948. 

48. F. Baskett et al., “Open, Closed and Mixed Networks of Queues With Different
Classes of Customers,” J. Assoc. Comput. Mach., 22, No. 2 (April 1975), pp. 248-


49. F. P. Kelly, “Networks of Queues,” Adv. Appl. Probab., 8, No. 2 (June 1976), pp. 416-


50. R. Schassberger, “The Insensitivity of Stationary Probabilities in Networks of
Queues,” Adv. Appl. Probab., 10, No. 4 (December 1978), pp. 906-12. 

51. K. Matthes, “Zur Theorie der Bedienungsprozesse,” Trans. Third Prague Conf. Inf. 
Theory, Prague, 1962. 

52. A. D. Barbour, “Networks of Queues and the Method of Stages,” Adv. Appl. Probab., 
8, No. 3 (September 1976), pp. 584-91. 

53. S. S. Lam, “Queueing Networks With Population Size Constraints,” IBM J. Res. 
Develop., 21, No. 4 (July 1977), pp. 370-8. 

54. P. Franken et al., Queues and Point Processes, Berlin: Akademie-Verlag, 1981. 

55. A. E. Eckberg, “Generalized Peakedness of Teletraffic Processes,” Proc. Tenth Int. 
Teletraffic Congress, Montreal, June 1983, p. 4.4 b.3. 

56. A. A. Fredericks, “Approximating Parcel Blocking via State Dependent Birth 
Rates,” Proc. Tenth Int. Teletraffic Congress, Montreal, June 1983, p. 5.3.2. 

57. W. Whitt, “Heavy-Traffic Approximations for Service Systems With Blocking,” 
AT&T Bell Lab. Tech. J., 63, No. 5 (May-June 1984), pp. 689-708. 

58. R. E. Barlow and F. Proschan, Statistical Theory of Reliability and Life Testing, 
New York: Holt, Rinehart and Winston, 1975. 

59. S. Karlin and Y. Rinott, “Classes of Orderings of Measures and Related Correlation 
Inequalities. I. Multivariate Totally Positive Distributions,” J. Multivar. Anal., 
10, No. 4 (December 1980), pp. 467-98. 

60. T. Kamae, U. Krengel, and G. L. O’Brien, “Stochastic Inequalities on Partially 
Ordered Space,” Ann. Probab., 5, No. 6 (December 1977), pp. 899-912. 

61. P. Billingsley, Convergence of Probability Measures, New York: Wiley, 1968. 

62. W. Whitt, “On the Heavy-Traffic Limit Theorem for GI/G/∞ Queues,” Adv. Appl.
Probab., 14, No. 1 (March 1982), pp. 171-90. 

63. D. W. Stroock and S. R. S. Varadhan, Multidimensional Diffusion Processes, New 
York: Springer-Verlag, 1979. 

64. T. Lindvall, “Weak Convergence of Probability Measures and Random Functions
in the Function Space D[0, ∞),” J. Appl. Probab., 7 (March 1973), pp. 109-21.

65. W. Whitt, “Some Useful Functions for Functional Limit Theorems,” Math. Oper. 
Res., 5, No. 1 (February 1980), pp. 67-85. 

66. L. Arnold, Stochastic Differential Equations: Theory and Applications, New York:
Wiley, 1974.

67. W. Whitt, “Weak Convergence of Probability Measures on the Function Space 
C[0, ∞),” Ann. Math. Statist., 47, No. 3 (June 1970), pp. 939-44.

68. K. R. Parthasarathy, Probability Measures on Metric Spaces, New York: Academic
Press, 1967.

69. W. Whitt, “Comparing Counting Processes and Queues,” Adv. Appl. Probab., 13, 
No. 1 (March 1981), pp. 207-20. 

70. E. Isaacson and H. B. Keller, Analysis of Numerical Methods, New York: Wiley, 
1966. 



71. F. P. Kelly, “Blocking Probabilities in Large Circuit-Switched Networks,” Statistical 
Laboratory, University of Cambridge, England, 1985. 

72. J. B. Ziedens and F. P. Kelly, “Loss Probabilities in Circuit-Switched Star Net- 
works,” Statistical Laboratory, University of Cambridge, England, 1985. 

73. D. Mitra, unpublished work. 

74. P. M. Lin et al., “Analysis of Circuit-Switched Networks Employing Originating 
Office Control With Spill Forward,” IEEE Trans. Commun., COM-26, No. 6 
(June 1978), pp. 754-65. 

75. A. Girard and Y. Ouimet, “End-to-End Blocking for Circuit-Switched Networks: 
Polynomial Algorithms for Some Special Cases,” IEEE Trans. Commun., COM- 
31, No. 12 (December 1983), pp. 1269-73. 

76. G. Iazolla, P. J. Courtois, and A. Hordijk, Mathematical Computer Performance and 
Reliability, Amsterdam: North-Holland, 1984. 


AUTHOR 


Ward Whitt, A.B. (Mathematics), 1964, Dartmouth College; Ph.D. (Opera- 
tions Research), 1968, Cornell University; Stanford University, 1968-1969; 
Yale University, 1969-1977; AT&T Bell Laboratories, 1977—. At Yale Uni- 
versity, from 1973-1977, Mr. Whitt was Associate Professor in the depart- 
ments of Administrative Sciences and Statistics. At AT&T Bell Laboratories 
he is in the Operations Research Department of the Systems Analysis Center, 
where the primary mission is to investigate and improve the product realization 
process. 



AT&T Technical Journal 
Vol. 64, No. 8, October 1985 
Printed in U.S.A. 


Performance Comparison of InGaAsP Lasers 
Emitting at 1.3 and 1.55 μm for Lightwave
System Applications 


By N. K. DUTTA,* R. B. WILSON,' D. P. WILT,* P. BESOMI,* 
R. L. BROWN,* R. J. NELSON,' and R. W. DIXON* 


(Manuscript received April 11, 1985) 


Experimental results on the performance of real index-guided
InGaAsP lasers emitting near 1.3 and 1.55 μm are described and compared.
The laser structures discussed are the etched mesa buried heterostructure,
the channeled substrate buried heterostructure, and the double channel planar
buried heterostructure. The effect of Auger recombination and intervalence
band absorption on the threshold current and external differential quantum
efficiency is discussed. The effect of the larger Auger coefficient at 1.55 μm is
compensated by a lower carrier density at threshold, so that the
total nonradiative current loss for lasers emitting at 1.55 μm is not significantly
larger than that for lasers emitting at 1.3 μm. A small linear shunt leakage
current (~10 mA) can increase the $T_0$ to ~100K. We report threshold currents
as low as 11 and 15 mA (at 30°C) and continuous-wave operating temperatures
as high as 130 and 110°C for lasers emitting at 1.3 and 1.55 μm, respectively.


I. INTRODUCTION 


Lightwave transmission systems are being installed throughout the 
world at a rapidly escalating pace. These new systems offer higher bit 
rate and longer repeater spacing than conventional systems and thus 
reduce the transmission cost per bit.’ 
Most of the lightwave systems installed so far (often called
first-generation systems) use multimode fibers and an operating wavelength
of about 0.85 μm.” Second-generation systems using single-mode fibers
and an operating wavelength near 1.3 μm offer longer repeater spacings
because the loss of silica fibers is lower at 1.3 μm than at 0.85 μm.?"
Zero chromatic dispersion near 1.3 μm for silica fibers allows the use
of multimode laser sources in single-mode fibers for long-distance high
bit-rate transmission without significant dispersion penalty.”°
Although the full potential of lightwave systems operating at 1.3 μm has
not yet been realized because it is a relatively new technology, it is
quite conceivable that a third-generation system operating at 1.55 μm
may be extensively installed in the near future because the silica fiber
loss is minimum at 1.55 μm.‘ Laboratory experiments have already
demonstrated the high system performance that can be achieved at
this wavelength.*®

Development and subsequent installation of new lightwave systems 
have been driven by the development of new high-quality components, 
namely fibers, sources (lasers), detectors (avalanche photodiode or pin 
photodiode), or integrated front ends, and transmitter and receiver 
electronics. For example, the development of low-cost silica fibers with 
zero dispersion at 1.55 μm will certainly influence the third-generation
systems. High-speed integrated circuits for receiver and transmitter
packages are currently being developed for high bit-rate (>1 Gb/s)
systems. This paper compares the performance of InGaAsP lasers
emitting at 1.3 and 1.55 μm. The former is currently in use in several
second-generation systems and the latter can be a single-frequency 
source for third-generation systems operating at that wavelength using 
conventional single-mode silica fibers. 

The performance requirements for lasers used in lightwave trans- 
mitters usually include linear (kink-free) light-current characteristics 
up to a certain power output (typically ~5-mW/facet, ~0-dBm power 
input to the fiber), capability of high bit-rate modulation, and long 
operating life. Low-threshold and high-differential quantum efficiency 
are desirable (necessary for some systems) in order to reduce the bias 
current and modulation needed from the drive circuitry. Since most 
lightwave transmitters need to operate over a range of temperature, a 
weak temperature dependence of the threshold of the laser is desira- 
ble—otherwise, thermoelectric controllers are required inside the 
transmitter package in order to stabilize the laser temperature. The 
conventional InGaAsP double heterostructure laser emitting at 1.55
μm may be more sensitive to temperature and have lower differential
quantum efficiency than a laser emitting at 1.3 μm. The former is due
to a larger nonradiative Auger recombination rate’® and the latter may
be due to larger intervalence band absorption®” at longer wavelengths.
The effect of these processes is discussed in detail in Sections II and
III. We find that the effect of the larger Auger coefficient at 1.55 μm
is compensated by a smaller carrier density at threshold, so that the
nonradiative Auger current at 1.55 μm is not significantly larger than
that for 1.3 μm.

The fabrication of InGaAsP lasers emitting at 1.55 μm by Liquid
Phase Epitaxy (LPE) usually requires the growth of an additional
InGaAsP layer (antimeltback layer) in order to prevent the meltback
of the active layer during the subsequent growth of the InP layer.!°

The effect of the thickness of this antimeltback layer on the threshold
current and efficiency of 1.55-μm lasers is discussed in Section IV.
Although this layer need not be present when the double heterostructure
is grown by Vapor Phase Epitaxy (VPE), fabrication of Distributed
Feedback (DFB)-type single-frequency lasers usually requires an
intermediate band gap layer (1.1 to 1.3 μm) between the active layer
(1.55 μm) and InP layers. Antimeltback layers are not needed for
1.3-μm InGaAsP double heterostructures grown by LPE, although DFB
lasers emitting at 1.3 μm need an intermediate gap layer (~1.1 μm)
between the active layer and the InP cladding layers for optimization.

Real index-guided lasers are needed as sources for high-performance 
fiber communication systems because these lasers are less susceptible 
to light-current nonlinearities and intensity self-pulsations than gain- 
guided lasers.’ Many strongly index-guided laser structures utilize 
reverse biased junctions for current confinement. Leakage currents, 
that is, current flowing around the active region, may be responsible 
for high-threshold and light-current sublinearity in nonoptimized 
structures.’? Since the leakage current in many cases varies as
$\exp(-\Delta E_g/kT)$, where $\Delta E_g$ is the difference in band gap of the blocking
layers (InP) and the active region (1.3- or 1.55-μm InGaAsP), we
expect the leakage currents in 1.55-μm InGaAsP lasers to be smaller
than those in 1.3-μm InGaAsP lasers. This is discussed in Section V.
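
The size of this effect can be estimated with a short calculation. The sketch below assumes a Boltzmann-type dependence exp(−ΔE_g/kT), an InP band gap of about 1.35 eV, and active-layer band gaps obtained from the emission wavelength through E ≈ 1.2398/λ (eV, with λ in μm). These numbers and the function names are illustrative assumptions, not values taken from the paper.

```python
import math

K_B_EV = 8.617e-5   # Boltzmann constant in eV/K
E_G_INP = 1.35      # assumed InP band gap in eV near room temperature


def band_gap_ev(wavelength_um: float) -> float:
    """Photon energy (eV) for a given emission wavelength (micrometers)."""
    return 1.2398 / wavelength_um


def relative_leakage(wavelength_um: float, temperature_k: float = 300.0) -> float:
    """Boltzmann factor exp(-dE/kT) for the InP / active-layer gap difference."""
    delta_e = E_G_INP - band_gap_ev(wavelength_um)
    return math.exp(-delta_e / (K_B_EV * temperature_k))


# Ratio of the leakage factors for 1.55-um and 1.3-um active layers at 300 K.
ratio = relative_leakage(1.55) / relative_leakage(1.3)
```

Under these assumptions the larger band gap step at 1.55 μm reduces the Boltzmann factor by more than two orders of magnitude relative to 1.3 μm; the actual leakage also depends on doping and geometry, which this sketch ignores.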

Experimental results from several types of real index-guided lasers
emitting at 1.3 and 1.55 μm are compared in Section VI. These results
show that the 1.55-μm lasers have a somewhat higher threshold current
(~30 percent at 30°C) and lower light output at 100 mA (~40 percent
at 30°C). The former, in our opinion, is due principally to a smaller
mode confinement factor (because of the presence of the antimeltback
layer), and the latter is due to the combined effect of lower photon
energy (~20 percent) and larger intervalence band absorption and free
carrier absorption at 1.55 μm. The measured threshold current as a
function of temperature for lasers emitting at both wavelengths can
be represented by the expression $J_{th} \sim J_0 \exp(T/T_0)$, where $T_0$ is a
parameter determining the temperature sensitivity. Similar $T_0$ values
are observed for lasers emitting at both wavelengths except in cases
where leakage current is believed to affect $T_0$. The limitations in high
bit-rate long-haul fiber communication systems introduced by the
source linewidth are discussed in Section VII.
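
The exponential expression for the threshold current can be inverted to estimate the characteristic temperature from two measurements. The following is a minimal sketch; the helper name `characteristic_temperature` and the sample data (a 15-mA threshold at 303 K and a synthetic second point generated with an assumed characteristic temperature of 60 K) are hypothetical and serve only to show the arithmetic.

```python
import math


def characteristic_temperature(t1_k: float, i1: float, t2_k: float, i2: float) -> float:
    """Solve I = I0 * exp(T / T0) for T0 from two (temperature, threshold) points."""
    return (t2_k - t1_k) / math.log(i2 / i1)


# Hypothetical data: threshold at 303 K, and a second point at 343 K
# generated from the same exponential law with T0 = 60 K.
t0_assumed = 60.0
i_303 = 15.0  # mA
i_343 = i_303 * math.exp((343.0 - 303.0) / t0_assumed)

# Recover T0 from the two points.
t0_fit = characteristic_temperature(303.0, i_303, 343.0, i_343)
```

With measured data, a least-squares fit of ln(I) against T over several temperatures would be more robust than a two-point estimate.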


II. AUGER EFFECT


Since the initial work by Beattie and Landsberg,’* it has been 
established that the band-to-band Auger processes are often a major 
nonradiative carrier loss mechanism in small band gap semiconductors.
We expect the nonradiative Auger rate for 1.55-μm band gap
material to be larger than that for 1.3-μm material.

The Auger rate ($R_a$) in undoped semiconductors varies approximately
as

$$R_a = \gamma n^3, \qquad (1)$$

where $\gamma$ is the Auger coefficient and $n$ is the injected carrier density.’*
The radiative recombination rate varies approximately as

$$R_r = B n^2, \qquad (2)$$

where $B$ is the radiative recombination coefficient.'*> The current
density of a broad-area laser at lasing threshold (in the absence of
other recombination mechanisms) is given by

$$J = e R_r d + e R_a d = J_r + J_a, \qquad (3)$$

where $e$ is the electron charge, $d$ is the active layer thickness, and $J_r$
and $J_a$ are the radiative and the Auger components of the current,
respectively. The Auger coefficient $\gamma$ in eq. (1) is dominated by
phonon-assisted Auger processes for larger band gap semiconductors and by
band-to-band processes for small band gap semiconductors. A detailed
discussion of band-to-band, phonon-assisted, and trap Auger processes
in direct gap semiconductors is given in Ref. 7. Figure 1 shows the
calculated $\gamma$ for InGaAsP alloy lattice matched to InP for $n = 10^{18}$
cm$^{-3}$. Because of the uncertainty in the calculation of the absolute
magnitude of the Auger coefficient, we have plotted $\gamma$ in Fig. 1
normalized to its value ($\gamma_0$) for 1.3-μm InGaAsP. The calculated $\gamma_0$
is approximately $1 \times 10^{-28}$ cm$^6$ sec$^{-1}$. The values measured by several authors are
shown in Table I. Figure 1 shows that the Auger coefficient for
1.55-μm InGaAsP is about a factor of 4 larger than that for 1.3-μm
InGaAsP. Agrawal and Dutta”? have found that the Auger coefficient
for 1.55-μm InGaAsP is about a factor of 3 larger than that for
1.3-μm InGaAsP from an analysis of the threshold current of stripe geometry
InGaAsP lasers emitting at 1.3 and 1.55 μm. Thus the Auger coefficient
for 1.55-μm InGaAsP is about 3 to 4 times larger than that for 1.3-μm
InGaAsP.
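
Equations (1) through (3) can be evaluated directly to compare the radiative and Auger components of the threshold current density. The sketch below is illustrative only: the carrier density (2 × 10^18 cm^-3), active layer thickness (0.2 μm), and coefficient values (B = 1 × 10^-10 cm^3 sec^-1, γ = 3 × 10^-29 cm^6 sec^-1) are assumed round numbers of the order discussed in this section, and the function name is hypothetical.

```python
E_CHARGE = 1.602e-19  # electron charge in coulombs


def threshold_current_density(n_cm3, d_cm, b_cm3_s, gamma_cm6_s):
    """Return (J_r, J_a, J) in A/cm^2 from J = e*B*n^2*d + e*gamma*n^3*d."""
    j_r = E_CHARGE * b_cm3_s * n_cm3 ** 2 * d_cm      # radiative component
    j_a = E_CHARGE * gamma_cm6_s * n_cm3 ** 3 * d_cm  # Auger component
    return j_r, j_a, j_r + j_a


# Assumed illustrative inputs: n = 2e18 cm^-3, d = 0.2 um = 2e-5 cm.
j_r, j_a, j_total = threshold_current_density(2e18, 0.2e-4, 1e-10, 3e-29)

# The ratio J_a / J_r reduces to gamma * n / B, independent of d.
auger_to_radiative = j_a / j_r
```

Because the ratio of the two components is γn/B, a larger Auger coefficient can be offset by a lower threshold carrier density, which is the compensation argument made for 1.55-μm lasers.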
Fig. 1—The relative Auger coefficient $\gamma/\gamma_0$ is plotted as a function of band gap of
InGaAsP.


Table I—Measured Auger coefficients


Coefficient (cm$^6$ sec$^{-1}$)     Reference                       Comment
5 × 10°? (CCHS)                     Mozer et al. (1982)?@           λ = 1.3 μm
2.3 ± 1 × 10° (Total)               Sermage et al. (1983)?”         λ = 1.3 μm, optically pumped
1 × 10-” (Total)                    Henry et al. (1981)             λ = 1.3 μm, doping dependence of τ
<3 × 107% (Total)                   Su et al. (1982)                λ = 1.3 μm
3 × 10°% (Total)                    Thompson (1983)'8               Fit to data of Su et al.
3-8 × 10°” (Total)                  Uji (1983)?                     λ = 1.3 μm LEDs
2.8-1.5 × 10$^{-29}$                Wintner and Ippen (1984)        λ = 1.3 μm, optically pumped
7.5 × 10$^{-29}$                    Wintner and Ippen (1984)        λ = 1.55 μm, optically pumped


The radiative recombination coefficient $B$ can be calculated by using the
Gaussian Halperin-Lax band tails and Stern’s matrix elements.'**
Figure 2 shows the calculated $B/B_0$ for InGaAsP for $n = 10^{18}$ cm$^{-3}$, where $B_0$
is the value of $B$ for 1.3-μm InGaAsP. The calculated $B_0 = 1 \times 10^{-10}$
cm$^3$ sec$^{-1}$. The measured values lie in the range $0.9$-$1.5 \times 10^{-10}$ cm$^3$
sec$^{-1}$. Equations (1) and (2) are strictly valid only for a nondegenerate
electron and hole gas. However, at the high injected carrier densities
(~$2 \times 10^{18}$ cm$^{-3}$) at laser threshold the electrons and holes are
degenerate. Degeneracy effects can be introduced by writing $\gamma$ and $B$ as
functions of $n$. Both $\gamma$ and $B$ decrease slowly with increasing carrier
density.7% 1415

Fig. 2—The relative radiative recombination coefficient $B/B_0$ is plotted as a function
of band gap of InGaAsP.

The laser threshold currents are determined principally by the
magnitudes of $\gamma$, $B$, and the threshold carrier density. The latter
depends to some extent on the laser structure, for example, through
the mode confinement factor and optical absorption. However, it is
possible to determine the injected carrier density at transparency ($n_0$)
from the band structure parameters alone.”? We consider undoped
materials. Using the Joyce-Dixon approximation” for the quasi-Fermi
levels and assuming parabolic bands, we show the calculated $n_0$ for
InGaAsP in Fig. 3. The threshold carrier density is usually 30 to 40
percent higher than $n_0$. Note that $n_0$ for 1.55-μm InGaAsP (in Fig. 3)
is smaller than that for 1.3-μm InGaAsP. This is due to the smaller
conduction band effective mass at long wavelengths. Since the radiative
recombination current varies as $Bn^2$, Figs. 2 and 3 suggest that
the threshold current of 1.55-μm InGaAsP lasers should be lower than
that for 1.3-μm InGaAsP lasers in the absence of Auger recombination.
This is shown in Fig. 4a.

Figure 4 shows the calculated radiative and Auger components of
the current, plotted as a function of temperature, for 1.3- and 1.55-μm
InGaAsP-InP broad-area double heterostructure lasers. The radiative
component is calculated using a constant (temperature-independent)
absorption loss of 830 cm$^{-1}$ in the active layer as in Ref. 7. The Auger
component of the total current is calculated using eq. (1) for the Auger
recombination rate, the temperature dependence of threshold carrier
density ($n_{th}$) from Figs. 17 and 23 of Ref. 7, the calculated temperature
dependence of $\gamma$, and $\gamma_0 = 3 \times 10^{-29}$ cm$^6$ sec$^{-1}$ (which falls within the
range of the measured values but is smaller than the calculated value’
of $1 \times 10^{-28}$ cm$^6$ sec$^{-1}$).

Fig. 3—The injected carrier density at transparency for undoped InGaAsP.

Auger rate calculations using the Kane band model” result in Auger
coefficients higher than observed experimentally.”® Recently, Haug”®
has calculated the Auger coefficients in 1.3-μm InGaAsP using the
band model of Chelikowsky and Cohen?’ for InP but with an energy
gap assumed to be that of λ = 1.3-μm InGaAsP. He finds that the
phonon-assisted Auger processes (CCCHP’) are dominant with a value
of ~$2.5 \times 10^{-29}$ cm$^6$ sec$^{-1}$. Using a calculated temperature dependence
of the phonon-assisted Auger process, this would result in a $T_0$ value
of 100 to 110K for 1.3-μm InGaAsP-InP lasers, which is higher than
the observed value in broad-area lasers (~60 to 70K). There is sufficient
uncertainty in the band structure parameters and the carrier
concentration at threshold that it is unreasonable to expect better
agreement between calculation and experiment. Figure 5 shows the
predicted $T_0$ values (between 300 and 350K) plotted as a function of
Auger coefficient (at 300K) for two different values of carrier density
at threshold. The quantity $k$ is the ratio of the Auger coefficients at
350 and 300K. The smaller value is the calculated result for the
phonon-assisted Auger processes, and the higher value is that for the
band-to-band processes. Although the temperature dependence of $\gamma$
has not been experimentally determined, the calculated temperature
dependence of $n_{th}$ agrees well with that of the measured threshold
current using short pulses.” The total threshold current, which is the
sum of the radiative and Auger components, is shown in Fig. 4c. The
smaller $J_r$ for 1.55-μm lasers is approximately compensated by the
larger $J_a$. The above calculation is for an InGaAsP active layer with
InP cladding layers both for 1.3- and 1.55-μm lasers. The presence of
an antimeltback layer reduces the confinement factor, which increases
the threshold current of 1.55-μm lasers over that shown in Fig. 4. This
is discussed in detail in Section IV. Furthermore, heterobarrier leakage
due to drift and diffusion may be responsible for the increased threshold
current of 1.3-μm InGaAsP DH lasers if the p-cladding layer doping
is too low.”**° These lasers also exhibit a higher temperature dependence
of threshold (lower $T_0$) than is commonly observed. Such heterobarrier
leakage is smaller in 1.55-μm InGaAsP lasers because of the
larger barrier height. The high-energy electrons generated in the Auger
process may also escape (thermionic emission) over the heterobarrier,
in which case the carrier leakage is expected to be independent of the
barrier height and p-cladding doping. Shah et al.*! have reported
observation of hot carriers, and Yamakoshi et al.** have reported
observation of carrier leakage in 1.3-μm InGaAsP-InP lasers. However,
Henry et al.** did not observe hot carriers in their experiments.

Fig. 4—The calculated radiative and Auger components of the current for lasers
emitting at 1.3 and 1.55 μm. (a) Radiative. (b) Auger component. (c) Total.


III. INTERVALENCE BAND ABSORPTION


Intervalence band absorption is an optical loss mechanism in the
active region of InGaAsP lasers that can reduce the external differential
quantum efficiency.®? The external differential quantum efficiency
($\eta_d$) is given approximately by

$$\eta_d = \frac{\alpha_m}{\alpha + \alpha_m}, \qquad (4)$$

where

$$\alpha = \Gamma \alpha_a + (1 - \Gamma) \alpha_c,$$

$\alpha_m$ is the mirror loss, $\Gamma$ is the confinement factor, and $\alpha_a$ and
$\alpha_c$ are the active layer and cladding layer losses, respectively. For a
250-μm long laser the “equivalent distributed” loss $\alpha_m$ = 40 cm$^{-1}$ using
$R$ = 0.35. Figure 6 shows $\eta_d$ plotted as a function of $\alpha_a$ for several
values of $\Gamma$. We assume $\alpha_c$ = 30 cm$^{-1}$. The light output $L$ at a current
$\Delta I$ above threshold is given by $L = \eta_d E_g \Delta I$, where $E_g$ is the band gap
of the active layer. The light output from 1.55-μm lasers for a given
$\Delta I$ and $\eta_d$ is lower than that for 1.3-μm lasers because the former have
lower photon energy.
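
The quantities in this paragraph are simple to evaluate numerically. The sketch below assumes the usual distributed mirror loss $\alpha_m = (1/L)\ln(1/R)$ for a cavity of length $L$ with facet reflectivity $R$, and takes the efficiency in the form $\eta_d = \alpha_m/(\alpha + \alpha_m)$ with $\alpha = \Gamma\alpha_a + (1-\Gamma)\alpha_c$. The confinement factor (0.3) and the two active-layer loss values are assumed for illustration, and the function names are hypothetical.

```python
import math


def mirror_loss(length_cm: float, reflectivity: float) -> float:
    """Equivalent distributed mirror loss (cm^-1): (1/L) * ln(1/R)."""
    return math.log(1.0 / reflectivity) / length_cm


def differential_efficiency(alpha_a, alpha_c, gamma_conf, alpha_m):
    """eta_d = alpha_m / (alpha + alpha_m), alpha = G*alpha_a + (1-G)*alpha_c."""
    alpha = gamma_conf * alpha_a + (1.0 - gamma_conf) * alpha_c
    return alpha_m / (alpha + alpha_m)


# 250-um cavity with R = 0.35, as quoted in the text.
alpha_m = mirror_loss(250e-4, 0.35)

# Assumed illustrative losses: alpha_c = 30 cm^-1, confinement factor 0.3.
eta_low_loss = differential_efficiency(50.0, 30.0, 0.3, alpha_m)
eta_high_loss = differential_efficiency(150.0, 30.0, 0.3, alpha_m)

# Photon energy scales as 1/wavelength, so for equal eta_d and current the
# 1.55-um output is lower than the 1.3-um output by this factor.
photon_energy_ratio = 1.3 / 1.55
```

The mirror loss evaluates to about 42 cm$^{-1}$, consistent with the rounded value of 40 cm$^{-1}$ used in the text, and a larger active-layer loss lowers the efficiency as expected.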

Sugimura® has calculated the intervalence band absorption in III-V
semiconductors using the Kane band model. The absorption is larger for
small band gap semiconductors and increases with increasing temperature.
The calculated absorption ($\alpha$) normalized to its value ($\alpha_0$) for
1.3-μm InGaAsP at 300K is shown in Fig. 7. The calculated $\alpha_0$ is
approximately 30 cm$^{-1}$. Henry et al.** have extrapolated the intervalence
band absorption in 1.55-μm InGaAsP from measurements in
InGaAs. Adams et al. have proposed that the high temperature
sensitivity of the threshold current (low $T_0$) of 1.6-μm InGaAsP lasers is
due to intervalence band absorption. Although this is not the dominant
mechanism responsible for the low $T_0$ of 1.55-μm InGaAsP lasers, it may
be why the observed $\eta_d$ of InGaAsP lasers emitting at 1.55 μm is lower
than that for lasers emitting at 1.3 μm.

Fig. 6—The calculated external differential quantum efficiency as a function of
optical loss in the active layer.


IV. ANTIMELTBACK THICKNESS 

Both 1.3- and 1.55-μm InGaAsP double heterostructures have been
grown by LPE and VPE techniques. However, as mentioned previ- 
ously, for LPE growth it is necessary to grow a short-wavelength (~1.1 
to 1.3-um) InGaAsP layer over the 1.55-um active layer to prevent 
meltback of the active layer during the subsequent growth of InP 
layer. A thick antimeltback layer reduces the confinement factor of 
the guided mode and hence increases the laser threshold. However, a 
smaller confinement factor reduces the effect of intervalence band 
absorption and hence increases the differential quantum efficiency. 
We now calculate the effect of antimeltback layer thickness on device 
threshold. 

The threshold gain $g_{th}$ is given by

$$\Gamma g_{th} = \Gamma \alpha_a + (1 - \Gamma)\alpha_c, \tag{5}$$

where $\Gamma$, $\alpha_a$, and $\alpha_c$ are the confinement factor and the absorption in
the active and cladding layers, respectively. We assume $\alpha_c = 30$ cm⁻¹
and $\alpha_a = 40$ cm⁻¹ and 150 cm⁻¹ for the curves in Fig. 8. The confinement
factor is calculated using the analysis of Butler® for a four-layer
guide. The refractive indices of InP, 1.3-µm InGaAsP, and the 1.55-µm
active layer are 3.21, 3.4, and 3.55, respectively. For the calculated $g_{th}$
in eq. (5), the threshold carrier density ($n_{th}$) is obtained from previous
calculations. Then the threshold current density at 300K is obtained
using eq. (3) with $B = 0.9 \times 10^{-10}$ cm³ sec⁻¹ and $\gamma = 9 \times 10^{-29}$ cm⁶
sec⁻¹. Figure 8 shows the results of the calculation for both $J_{th}$ and $\eta_d$
as a function of antimeltback layer thickness. Both the threshold
current and the external differential quantum efficiency increase as
the antimeltback layer thickness is increased. The former is due to
the larger threshold gain caused by the reduced confinement factor as the
antimeltback layer thickness is increased, and the latter is due to
the lower loss experienced by the lasing mode in the active layer (in the
above model) due to smaller $\Gamma$. The observed smaller $\eta_d$ of the 1.55-µm
lasers may also be due to larger free carrier absorption both in the
cladding and active layer at longer wavelengths.

Fig. 7—The calculated intervalence band absorption from Ref. 8 normalized to its
value for 1.3 µm.

Fig. 8—The (a) threshold current and (b) external differential quantum efficiency as a function of the thickness of the
antimeltback layer.


V. LEAKAGE CURRENT 


Real index-guided lasers are needed for high bit-rate fiber transmission
systems because of their superior performance over gain-guided
lasers.’ Many strongly index-guided lasers utilize reverse biased junctions
for current confinement to the active region.” It is generally
believed that leakage currents, that is, currents flowing around the
active region, are responsible for high threshold current, light-current
sublinearity, and poor high-temperature performance of nonoptimized
laser structures. We have previously analyzed the leakage currents in
several 1.3-µm InGaAsP-InP laser structures using electrical equivalent
circuit models.’” The leakage current through leakage paths with
p-n junctions varies approximately as $\exp(-\Delta E_g/kT)$, where $\Delta E_g$ is the
band gap difference between InP and the active layer. $\Delta E_g \approx 0.39$ eV
and $\approx 0.55$ eV for InGaAsP lasers emitting at 1.3 and 1.55 µm, respectively.
Thus the magnitude of leakage current is expected to be
approximately $\exp(0.16\ \mathrm{eV}/kT) \approx e^6$ times smaller for InGaAsP
lasers emitting at 1.55 µm than for lasers emitting at 1.3 µm. In many
laser structures the leakage current flows through forward-biased p-n
InP homojunctions.” A fraction of this current in 1.3-µm InGaAsP
lasers can be detected as radiative emission at ~0.95 µm, which corresponds to the
band gap of InP.*® No emission at ~0.95 µm is observed from 1.5-µm
InGaAsP lasers, which suggests that leakage currents through InP
homojunctions are significantly smaller in these lasers. These considerations
do not apply to linear resistive shunt paths.
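The size of the exponential factor quoted above can be checked with a short sketch. It assumes a leakage current proportional to $\exp(-\Delta E_g/kT)$ with identical prefactors at both wavelengths, which is an illustrative simplification and not part of the equivalent-circuit analysis cited in the text.

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K


def leakage_ratio(delta_e_small, delta_e_large, temp_k):
    # Ratio I_leak(delta_e_small) / I_leak(delta_e_large) for I ~ exp(-delta_E / kT)
    return math.exp((delta_e_large - delta_e_small) / (K_B_EV * temp_k))


# Band gap differences quoted in the text: 0.39 eV (1.3 µm) and 0.55 eV (1.55 µm)
ratio = leakage_ratio(0.39, 0.55, 300.0)
exponent = math.log(ratio)
print(round(exponent, 1), round(ratio))
```

At 300K the exponent comes out slightly above 6, in line with the $e^6$ estimate; the ratio is strongly temperature dependent because $kT$ sits in the denominator.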


VI. EXPERIMENTAL RESULTS 


In this section, we compare the experimental results from several
types of real index-guided InGaAsP lasers emitting at 1.3 and 1.55
µm. We first discuss the strongly index-guided lasers. Over the last
few years, we have fabricated the Channeled Substrate Buried Heterostructure
(CSBH),®’ the Etched Mesa Buried Heterostructure
(EMBH)***? and the Double-Channel Planar Buried Heterostructure
(DCPBH)* lasers (see Fig. 9).*' The CSBH laser has a nonplanar
active region and can be fabricated using one LPE growth. The EMBH
and DCPBH lasers have planar active regions, which make them
compatible with the fabrication of DFB and DBR-type single-frequency
lasers.*” These structures need two epitaxial growth steps for
fabrication. Schematic cross sections of these laser structures are
shown in Fig. 9.


6.1 Double channel planar buried heterostructure lasers 


The schematic cross section of this device structure is shown in Fig.
9c. The fabrication of DCPBH lasers involves two epitaxial growth
steps.*° Briefly, a planar double heterostructure is grown by LPE over
an n-InP substrate followed by the double channel etching and then
regrowth. The CW light-current (L-I) characteristics of a DCPBH
laser emitting at 1.3 µm are shown in Fig. 10. Note that this particular
laser operates CW up to 130°C. These lasers typically have pulsed
threshold currents in the range 15 to 25 mA at 30°C, and external
differential quantum efficiencies in the range 0.2 to 0.25 mW/mA/facet.
The variation of threshold current ($I_{th}$) with temperature ($T$) is
given by the commonly used expression $I_{th}(T) \approx I_0 \exp(T/T_0)$, with
$T_0$ values in the range 80 to 100K. The higher $T_0$ values of many
DCPBH lasers when compared with broad area lasers ($T_0 \approx$ 60 to
70K) could be due to a small (~10 mA) temperature-independent
leakage current. This is supported by our observation that lasers with
low threshold (<15 mA at 30°C) have smaller $T_0$ values than lasers
with somewhat higher threshold (~25 mA at 30°C).

Fig. 9—The schematic cross sections of the (a) EMBH, (b) CSBH, and (c) DCPBH
lasers are shown.
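The argument that a temperature-independent leakage current raises the measured $T_0$ can be made concrete with a small sketch. It assumes the threshold is the sum of a constant leakage term and an active-region term that follows $I_0 \exp(T/T_0)$, and it defines the apparent $T_0$ as $I_{th}/(dI_{th}/dT)$. The split of each threshold into active and leakage parts below is hypothetical.

```python
def apparent_t0(intrinsic_t0_k, active_ma, leak_ma):
    # I_th = I_leak + I_active, with dI_th/dT = I_active / T_0 when I_leak is constant,
    # so the apparent T_0 = I_th / (dI_th/dT) = T_0 * (I_active + I_leak) / I_active
    return intrinsic_t0_k * (active_ma + leak_ma) / active_ma


# Hypothetical splits of the 30°C threshold current, intrinsic T_0 = 60K
print(apparent_t0(60.0, 15.0, 0.0))   # no leakage
print(apparent_t0(60.0, 12.0, 3.0))   # 15-mA threshold with 3 mA of leakage
print(apparent_t0(60.0, 15.0, 10.0))  # 25-mA threshold with 10 mA of leakage
```

With these illustrative numbers the apparent $T_0$ rises from 60K to 75K and then to 100K as the leakage share grows, which is the trend the observation in the text points to.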

Figure 11 shows the L-I characteristics of a DCPBH laser emitting
at 1.55 µm. The doping levels and layer thicknesses of the lasers
emitting at 1.55 and 1.3 µm are similar except that the 1.55-µm double
heterostructure has an antimeltback layer (~0.15 µm thick) of 1.3-µm
InGaAsP.

Fig. 10—Light-current characteristics of a DCPBH laser emitting at 1.3 µm.

Fig. 11—Light-current characteristics of a DCPBH laser emitting at 1.55 µm.

Table II—Performance characteristics of typical DCPBH lasers

                                  λ = 1.3 µm         λ = 1.5 µm
$I_{th}$ (30°C)                   15-25 mA           20-35 mA
$L$ (100 mA)                      10-16 mW           6-9 mW
Efficiency/facet                  0.2-0.25 mW/mA     0.12-0.15 mW/mA
$T_0$                             90-100K            55-65K
$T_{max}$ (CW)                    130°C              110°C
Fabrication requirements
  Active area (kinks)             <0.3 µm²           <0.2 µm²
  Antimeltback layer thickness    —                  0.1-0.15 µm


The performance characteristics of our typical DCPBH lasers emitting
at 1.3 and 1.5 µm are compared in Table II. The parameters
shown are threshold current, external differential quantum efficiency,
light output at 100 mA, $T_0$, and the maximum CW operating temperature
($T_{max}$). The performance of these lasers at both wavelengths is
acceptable for many lightwave system applications. Nevertheless, we
are interested in determining to what extent 1.3-µm lasers are intrinsically
superior to 1.55-µm devices and to what extent the present
differences represent relatively easy-to-overcome technological differences.
The lowest threshold currents observed for 1.3- and 1.55-µm
lasers are 12 and 15 mA at 30°C, respectively. The maximum CW operating
temperatures of these lasers are 130 and 110°C, respectively.

The threshold current of the 1.5-µm lasers is slightly higher (~30
percent) than that of the 1.3-µm lasers. This is due to the smaller
confinement factor of the waveguide mode of the lasers emitting at
1.5 µm caused by the antimeltback layer. This additional layer is not
required for double heterostructures grown by vapor phase epitaxy,
and hence in that case the threshold current may be lower.

The external differential quantum efficiency per facet of the lasers
emitting at 1.55 µm is lower than that for lasers emitting at 1.3 µm.
Using $\eta_d = 0.25$ mW/mA for 1.3-µm lasers and 0.15 mW/mA for 1.5-µm
lasers, the optical absorption of the guided mode using eq. (4) is
37 and 67 cm⁻¹ for the lasers emitting at 1.3 and 1.5 µm, respectively.
In deriving the above, we have used a mirror loss of 40 cm⁻¹, which
corresponds to a mode reflectivity of 0.35 for our 250-µm long lasers.
The optical loss of the mode is related to the cladding and active layer
loss by the following expression:

$$\alpha = \Gamma \alpha_a + (1 - \Gamma)\alpha_c. \tag{6}$$

Using $\alpha_c = 30$ cm⁻¹ for both wavelengths and a calculated $\Gamma = 0.47$
and 0.38 for the 1.3- and 1.5-µm laser structures, we get $\alpha_a = 44$ and
127 cm⁻¹ for the active layer losses of the 1.3- and 1.55-µm lasers,
respectively, at 30°C. These estimates of the absorption loss are an
upper limit because eq. (4) neglects the effect of leakage currents,
which can reduce the efficiency. The external differential quantum
efficiency can be increased by decreasing the length of the laser. For
1.55-µm lasers, a smaller confinement factor can also increase the
efficiency, as discussed in Section IV.
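The active layer losses quoted above follow from solving eq. (6) for $\alpha_a$. The sketch below simply performs that inversion with the mode losses, confinement factors, and cladding loss stated in the text; the function name is illustrative.

```python
def active_layer_loss(alpha_mode, gamma, alpha_clad):
    # Invert eq. (6): alpha = Gamma * alpha_a + (1 - Gamma) * alpha_c
    return (alpha_mode - (1.0 - gamma) * alpha_clad) / gamma


loss_13 = active_layer_loss(37.0, 0.47, 30.0)  # 1.3-µm structure, cm^-1
loss_15 = active_layer_loss(67.0, 0.38, 30.0)  # 1.5-µm structure, cm^-1
print(round(loss_13, 1), round(loss_15, 1))
```

The results, roughly 45 and 127 cm⁻¹, agree with the quoted 44 and 127 cm⁻¹ to within the rounding of the inputs.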

The light output at 100 mA is also shown in Table II because it can
be a measure of the combined effect of threshold current and external
differential quantum efficiency and also provide an estimate of the
drive current needed when the laser is used in lightwave transmitters.
Note that this quantity is smaller for lasers emitting at 1.5 µm, because
of the combined effect of lower efficiency and lower photon energy of
these devices. Shorter lasers should have higher light output at 100
mA than that shown in Table II. However, the reliability of short
lasers may be a problem because of higher threshold current (and
carrier) density.

The observed temperature dependence of threshold current of lasers
emitting at 1.3 µm is lower than that for lasers emitting at 1.5 µm.
This is represented by the higher $T_0$ value of the 1.3-µm lasers. We believe
that a temperature-independent leakage path (leakage current ~10
mA at 30°C) can be responsible for the high $T_0$ (~90 to 100K) of 1.3-µm
lasers because low threshold ($I_{th} \approx 15$ mA at 30°C) 1.3-µm lasers have
lower $T_0$ (~75K). High $T_0$ values caused by leakage current have been
observed previously in EMBH lasers emitting at 1.3 µm.*? The smaller
leakage current in the DCPBH structure enables 1.3-µm DCPBH
lasers to operate at temperatures as high as 130°C.

Spatial hole burning causes transverse mode transitions with increasing
injection in strongly index-guided lasers.** These mode transitions
appear as “kinks” in the L-I characteristics and are undesirable
for lightwave applications. Spatial hole burning can be reduced and
the L-I kinks eliminated by reducing the active area of the laser to
less than ~0.3 µm² for lasers emitting at 1.3 µm** and ~0.2 µm² for
lasers emitting at 1.55 µm. This fabrication requirement for kink-free
operation of 1.55-µm lasers is more difficult to achieve.


6.2 Etched mesa buried heterostructure laser 


The light-output-current characteristics of an EMBH laser emitting
at 1.3 µm are shown in Fig. 12. These lasers typically have threshold
currents in the range 15 to 30 mA at 30°C and external differential
quantum efficiencies in the range 0.2 to 0.25 mW/mA. The lowest
threshold current observed for 1.3-µm EMBH lasers is 11 mA at 30°C
and the maximum CW operating temperature is 100°C. The variation
of threshold current ($I_{th}$) with temperature ($T$) is characterized by $T_0$ values
in the range 60 to 75K. These lasers with optimized layer thicknesses
and doping levels do not exhibit a sublinearity in the light-current
characteristics for power less than ~15 mW/facet near room temperature.
At high injection, a thyristor-like leakage path causes the L-I
characteristics to roll over.!”*° The fabrication requirements on active
layer dimensions for low-threshold and kink-free operation of these
lasers are shown in Table II.

Fig. 12—Light-current characteristics of an EMBH laser emitting at 1.3 µm.


6.3 Channeled substrate buried heterostructure lasers 


We now discuss the experimental results for the CSBH laser (which
has a nonplanar active layer). The schematic cross section of this
device structure is shown in Fig. 9b. The light-current (L-I) characteristics
of a CSBH laser emitting at 1.3 µm are shown in Fig. 13. The
CSBH lasers are fabricated by LPE growth of an n-InP layer, 1.3-µm
InGaAsP active layer, p-InP layer, and InGaAs contact layer over a
base structure that has V grooves etched in it. The base structure has
a p-InP current blocking layer, which may be LPE grown*’ or VPE
grown“ over an n-InP substrate. Cd-diffusion can also be used to form
the blocking layer.*’ Alternatively, Fe implantation® or Fe-doped high-resistivity
InP layers*® can be used to limit the current flow to the
active region in the V groove. All of the above schemes for base
structures of CSBH lasers yield devices with comparable threshold
current and efficiency. The threshold current of CSBH lasers emitting
at 1.3 µm is typically in the range of 15 to 30 mA, and the external
differential quantum efficiency is 0.17 to 0.20 mW/mA/facet. Kink-free
operation has been achieved to >24 mW/facet in lasers with active
area less than 0.3 µm².

Fig. 13—Light-current characteristics of a CSBH laser emitting at 1.3 µm.

CSBH lasers emitting at 1.55 µm have been fabricated using a Cd-diffused
base structure and an Fe-doped InP base structure grown by Metal Organic
Chemical Vapor Deposition (MOCVD). Figure 14
shows the L-I characteristics of a CSBH laser emitting at 1.55 µm
with a Cd-diffused base structure. The CSBH lasers emitting at 1.55
µm have lower quantum efficiency than lasers emitting at 1.3 µm. The
threshold current, quantum efficiency, and $T_0$ values of CSBH lasers
emitting at 1.55 µm fabricated using Cd-diffused and MOCVD base
structures are similar.

Performances of typical CSBH lasers emitting at 1.3 and 1.55 µm
are compared in Table III. The parameters shown are threshold
current, efficiency, light output at 100 mA, $T_0$, and maximum CW
operating temperature. The data are representative of results from
several wafers of each type. The observed lowest threshold currents
for CSBH lasers emitting at 1.3 and 1.55 µm are 12 and 18 mA at
30°C. The maximum CW operating temperatures for the 1.3- and 1.55-µm
lasers are 120 and 90°C, respectively.

Fig. 14—Light-current characteristics of a CSBH laser emitting at 1.55 µm
(figure annotations: $T_0 \approx 58$K, $\eta_d \approx 0.12$ mW/mA).

Table III—Performance characteristics of typical CSBH lasers

                      λ = 1.3 µm         λ = 1.5 µm
$I_{th}$ (30°C)       15-25 mA           25-35 mA
Efficiency/facet      0.17-0.2 mW/mA     0.11-0.14 mW/mA
$L$ (100 mA)          -13 mW             6-8 mW
$T_0$                 50-55K             50K
$T_{max}$             90°C               90°C


VII. DISCUSSION


The information-carrying capacity of a digital link is given by the 
product of the bit rate (B) and the distance (L) between the trans- 
mitter and the receiver. It is a common practice to characterize a 
transmission system by its bit-rate-distance product, although this 
may not be the judge of the overall performance.” 

The loss limited transmission distance ($L$) is determined by the
minimum number of photons per bit needed by a receiver to detect it.
It is given by

$$P_R = P_T \exp(-\alpha L)$$

or

$$L = \frac{10}{\alpha} \log_{10} \frac{P_T}{P_R}, \tag{7}$$

where $\alpha$ is the fiber loss (expressed in dB/km in the second form) and $P_T$, $P_R$ are the transmitted and
received power. In the loss limit, $P_R$ varies linearly with the bit rate $B$.
The solid lines in Figs. 15 and 16 show the loss limit for 1.3- and 1.55-µm
systems using $P_T = 0$ dBm (1 mW).

Fig. 15—Bit rate versus repeater spacing for lightwave systems operating near 1.3
µm. (Axes: repeater spacing in kilometers versus bit rate in megabits per second;
curve labels: state-of-the-art (loss limit), $D = 1$ ps/nm·km.)

Fig. 16—Bit rate versus repeater spacing for lightwave systems operating near 1.55
µm.

At high bit rates, fiber dispersion becomes an important limitation
because of pulse spreading. The effect of pulse spreading on the
performance of lightwave systems has been calculated by several
authors.°”°? The dispersion limit is usually given by”! (within a factor
of 2)

$$BL < 1/(4D\sigma), \tag{8}$$

where $D$ is the fiber dispersion and $\sigma$ is the source linewidth. Commercial
silica fibers have zero dispersion near 1.3 µm and $D \approx 15$ ps/nm·km
at 1.55 µm. Using $\sigma \approx 3$ nm for 1.55-µm multimode sources, eq.
(8) suggests that in spite of the high loss limit the bit-rate-distance
product is limited to 5.5 Gb-km for 1.55-µm transmission systems
unless single-frequency sources are used.
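Both limits can be evaluated with a short sketch. The dispersion limit uses eq. (8) with the values quoted in the text. The loss limit uses the second form of eq. (7) written in dB units; the receiver sensitivity of -30 dBm and the fiber loss of 0.25 dB/km are hypothetical numbers chosen only to show the arithmetic.

```python
def loss_limited_km(p_t_dbm, p_r_dbm, loss_db_per_km):
    # eq. (7): L = (10 / alpha) * log10(P_T / P_R); with powers in dBm the
    # term 10*log10(P_T / P_R) is simply the difference p_t_dbm - p_r_dbm
    return (p_t_dbm - p_r_dbm) / loss_db_per_km


def dispersion_limit_gb_km(d_ps_per_nm_km, sigma_nm):
    # eq. (8): B*L < 1 / (4 * D * sigma). D*sigma is in ps/km, so the
    # reciprocal is in Tb/s·km; multiply by 1000 to express it in Gb/s·km
    return 1000.0 / (4.0 * d_ps_per_nm_km * sigma_nm)


multimode = dispersion_limit_gb_km(15.0, 3.0)   # 3-nm multimode source at 1.55 µm
chirped = dispersion_limit_gb_km(15.0, 0.1)     # 1-Å (0.1-nm) chirp width
span = loss_limited_km(0.0, -30.0, 0.25)        # hypothetical receiver and fiber
print(round(multimode, 1), round(chirped), span)
```

The first two results, about 5.6 and 167 Gb-km, are consistent with the 5.5 and 160 Gb-km figures given in the paper for a 3-nm multimode source and a 1-Å chirp.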

Several schemes have been used to obtain single-frequency emission
at 1.55 µm. Cleaved-coupled-cavity,’ external cavity, and DFB-type*”
single-frequency lasers have been fabricated using the DCPBH
laser structure shown in Fig. 9c. Although all of these schemes produce
essentially single-frequency sources (linewidth <20 MHz) under CW
excitation, the linewidth under pulsed current modulation is significantly
larger (1 to 2 Å).°°*§ This phenomenon is usually called frequency
chirping and it arises from a modulation of the refractive index
by the injected current that modulates the effective cavity length. The
chirp limit using eq. (8) for three different chirp widths is shown by
the dashed curves in Fig. 16. External modulators must be used to
eliminate the frequency chirping.

Dispersion limitations can also be important for 1.3-µm systems at
high bit rates. In most optical fibers, the dispersion $D$ is greater than
1 ps/nm·km at wavelength separation $\Delta\lambda > 10$ nm from the zero
dispersion point. Using $D = 1$ ps/nm·km, we get the dashed lines in
Fig. 15 as the dispersion limit for multimode source linewidths of 15,
30, and 50 Å, respectively. The longitudinal mode spacing of a 250-µm
long 1.3-µm InGaAsP laser is ~9 Å. This suggests that the 30-Å line in
Fig. 15 may be a practical limit for 1.3-µm systems using multimode
sources. Furthermore, it is well known that the emission spectrum of
a 1.3-µm InGaAsP laser under microwave modulation (>2.5 Gb) is
considerably broader (because of the appearance of many longitudinal
modes) than for low bit-rate (<1 Gb) modulation.”® This broadening
is due to band filling, which becomes significant for modulation
frequencies larger than inverse carrier recombination times. Thus, for
high bit-rate-distance operation near 1.3 µm it is necessary to use
single-frequency sources unless the emission spectrum under modulation
is within ±10 nm of the zero dispersion wavelength of the fiber.
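The mode spacing of about 9 Å can be estimated from the standard Fabry-Perot relation $\Delta\lambda = \lambda^2 / (2 n_g L)$. This relation and the group index of 4 are assumptions brought in for illustration; neither is stated in the paper.

```python
def mode_spacing_angstrom(wavelength_um, cavity_um, group_index):
    # Fabry-Perot longitudinal mode spacing: d_lambda = lambda^2 / (2 * n_g * L)
    spacing_um = wavelength_um ** 2 / (2.0 * group_index * cavity_um)
    return spacing_um * 1.0e4  # 1 µm = 10^4 Å


spacing = mode_spacing_angstrom(1.3, 250.0, 4.0)  # assumed group index of 4
modes_in_30_angstrom = 30.0 / spacing
print(round(spacing, 2), round(modes_in_30_angstrom, 1))
```

With these assumptions the spacing is a little under 9 Å, so a 30-Å linewidth corresponds to only three or four longitudinal modes.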


VIII. CONCLUSION AND SUMMARY


We have compared the performance of real index-guided InGaAsP
lasers emitting at 1.3 and 1.55 µm. The 1.3-µm lasers have somewhat
lower threshold current (~20 percent) than 1.55-µm lasers. This is
principally due to the smaller confinement factor of the 1.55-µm lasers
caused by the presence of the antimeltback layer in our LPE grown lasers.
The VPE growth technique does not require the presence of an antimeltback
layer; thus essentially similar threshold currents can be realized at
both 1.3 and 1.55 µm. The measured median threshold currents at 30°C
from several wafers of our CSBH and DCPBH lasers are 15 to 25 and
20 to 35 mA for lasers emitting at 1.3 and 1.55 µm, respectively. The
lowest threshold currents observed at 30°C are 11 and 15 mA for lasers
emitting at 1.3 and 1.55 µm, respectively.

The temperature dependence of the threshold current is given by
the commonly used expression $I_{th}(T) \approx I_0 \exp(T/T_0)$ with $T_0 \approx$ 60 to 75K
for 1.3-µm lasers and 55 to 65K for 1.55-µm lasers. A small temperature-independent
leakage current (~10 mA) is believed to be responsible
for the high $T_0$ values (~100K) of some 1.3-µm DCPBH lasers. The
nonradiative Auger recombination process that increases with decreasing
bandgap is believed to be responsible for the low $T_0$ values of the
long-wavelength (both 1.3 and 1.55 µm) InGaAsP lasers. The effect of
the larger Auger coefficient at 1.55 µm is compensated by lower carrier
density at threshold at 1.55 µm so that the total nonradiative current
loss for lasers emitting at 1.55 µm is not significantly larger than that
for lasers emitting at 1.3 µm. This results in similar $T_0$ values for 1.3-
and 1.55-µm lasers.

The measured efficiency of the 1.55-µm lasers is smaller (~20
percent) than that for 1.3-µm lasers. This may be due to the combined
effect of larger intervalence band absorption and free carrier absorption
at longer wavelengths. The smaller efficiency combined with
lower photon energy (~20 percent) at 1.55 µm than at 1.3 µm makes
the light output at a given operating current (~100 mA) of the 1.55-µm
laser lower by ~35 percent than that for 1.3-µm lasers. This shows
that the transmitter circuitry will need higher drive current when
1.55-µm lasers are used. The higher operating current may have some
reliability implications. The external differential quantum efficiency
for 1.55-µm lasers can be increased by fabricating short lasers. However,
the reliability of short lasers may be a problem because of higher
threshold current density in these devices.

Capacitance associated with leakage junctions is believed to influence
the high frequency modulation capability of index-guided lasers
that use reverse biased junctions for current confinement. We have
fabricated 1.3- and 1.55-µm lasers that can be modulated at high bit
rates (>2 Gb/s). The laser linewidth under modulation is found to be
significantly broader than that under CW operation. The phenomenon
is called frequency chirping and arises from the modulation of the
carrier density that modulates the refractive index of the guided wave.
The measured chirp width of the 1.55-µm laser is larger than that for
the 1.3-µm laser, because of the larger variation of refractive index
with carrier density at longer wavelengths. The thickness of the
antimeltback layer in our 1.55-µm lasers determines to some extent
the measured chirp width; for example, lasers with larger antimeltback
layer thickness have a smaller confinement factor and hence have less
chirp. If the laser is modulated at bit rates higher than the relaxation
oscillation frequency, the chirp width is significantly enhanced. Since
relaxation oscillations are damped in strongly index-guided lasers, we
expect these lasers to have less chirping at high bit rates (>1 Gb/s)
than weakly index-guided lasers. The external cavity-type single-frequency
lasers have been fabricated from our multimode lasers.®
These single-frequency lasers exhibit less chirp (~2) than the multimode
lasers due to the frequency pulling effect.®' Such frequency pulling
effects can reduce the chirp width of distributed feedback type lasers
also. A chirp of 1 Å limits the bit-rate-distance product to 160 Gb-km
for 1.55-µm transmission systems using conventional silica fibers.

For terrestrial digital radio systems that use Quadrature Amplitude Modulation,
the idea of adapting equalizers to multipath distortion, without relying
on accurate data estimates, is attractive. Prompt adaptation following a severe 
fade, when accurate data estimates are unavailable, is useful for reducing 
outage time. To avoid processing and administrative overhead, the adaptation 
method should not involve violating the transmitted signal with the insertion 
of equalizer training signals. We approach this kind of equalization by building 
on an algorithm of D. Godard (IEEE Transactions on Communications, 
November 1980)¹ that was devised for voiceband polling networks. The method
involves a very simple tap update procedure. However, the technique lacks the 
foundation of the years of analysis and experimentation that underlie least- 
mean-square adaptation algorithms. The main purpose of this paper is to 
present new findings, including (1) a proof that the algorithm, thought to 
require special equalizer initialization, converges regardless of initialization 
(this offers useful flexibility in digital radio systems, since, after a severe fade, 
the algorithm could start with any tap misalignment); (2) a preliminary look 
at convergence speed suggesting the possibility of significant outage reduction; 
(3) an algorithm that provides phase coherence (the original algorithm requires 
a follow-on phase-locked loop); and (4) an algorithm for cross-polarization 
cancellation as well as equalization. 


I. INTRODUCTION
1.1 The problem of prompt data detection 


Consider a Quadrature Amplitude Modulation (QAM) digital radio
signal (or a dually polarized QAM pair) propagating through a medium
subject to slowly, randomly varying, frequency selective fades (and
cross-polarization coupling). On certain occasions the loss of signal
can be so complete that a theoretically optimum receiver could not
detect the data. Subsequently, a strong signal returns, but carrier and
timing may have wandered and the medium may have significantly
changed its dispersive character. The objective is to detect the data
symbols forthwith as the signal strength returns.

It is the uncertainty about the various features of the received signal, 
apart from the inherent uncertainty associated with the information 
symbols (and additive noise), that slows the recapture process. Carrier 
frequency and phase, and timing frequency and phase are all to some 
degree uncertain. Moreover, the 2 × 2 matrix transfer characteristic
of the dispersive medium (the diagonal elements describe the co- 
polarization transfer characteristics and the off-diagonal terms express 
the couplings between polarizations) is also uncertain. This medium 
must be equalized to enable accurate data detection. During the 
bootstrapping, reliable data estimates are unavailable. Consequently, 
data-directed Minimum-Mean-Square Error (MMSE) equalization is 
not feasible. 

In this paper a method of equalization is analyzed that does not 
require the availability of data estimates. The method involves tap 
adjustments based on simple computations using samples of the re- 
ceived QAM signal and of the equalizer output. For simplicity, in the 
following sections equalization (along with cross-polarization cancel- 
lation) is considered in isolation assuming carrier and timing recovery 
have somehow been accommodated. We stress that equalization is one 
part of the bootstrapping process. At the time of this writing, carrier 
recovery is a topic of research. One promising approach employs a 
quartic nonlinearity. Later we will say more about carrier recovery. 
Regarding timing frequency, we anticipate that the squared-envelope 
method is adequate in most applications. Practical realizations of the 
systems we analyze are assumed to employ a sufficient number of 
fractionally spaced taps to be quite robust to timing phase. 


1.2 Are probing tones not needed? 


The approach to equalization that we treat leaves the standard form
of the transmitted signal inviolate. (In contrast, one could monitor the
medium with real-time measurements and adjust equalizers on the
basis of the measurements.) Is the standard (no modification at the
transmitter) signal already a media probing signal? Is it also a control
signal that arranges for equalizers and cross-polarization cancelers to
automatically align in response to reasonable real-time, digital signal
processing? A practical affirmative answer would enable one to avoid
the processing and administrative overhead associated with altering
the form of the transmitted signal.


1.3 Outline of results


In this paper we take an approach to equalization, without altering 
or detecting data, that was originally designed for expediting start-up 
in equalizers in voice band polling networks. The method, originated 
by D. Godard,” involves a very simple tap update procedure and for 
that reason is especially attractive. [See Ref. 3 for an earlier paper 
providing a method for PAM (but not QAM) signals. Related research 
has been conducted.**] The algorithm assumes the average squared 
constellation vector is zero and is presented in Section IV. First some 
background is needed. In Section II the basic model for single polari- 
zation transmission is presented. Section III discusses a general view 
of the tap evolution that will be used. 

Godard’s algorithm lacks the foundation of the years of analysis and 
experimentation that underlie least-mean-square adaptation algo- 
rithms. The main purpose of this paper is to present new findings on 
the mathematical theory of this little-understood algorithm. 

The algorithm was thought to require special tap initialization to 
converge. We show that the algorithm converges regardless of initial- 
ization (Section V). This flexibility is significant given the vagaries of 
tap misalignments that could be associated with severe channel fading. 

In Section VI we take a preliminary look at the subject of convergence
speed. This is accomplished by reinterpreting published numerical
work¹ (aimed at voiceband systems) for digital radio applications.
While a mathematical analysis of the transient behavior seems intractable,
Section VI explains that considerable insight can be obtained
from the analysis of a much simpler related problem.

Two new algorithms related to the Godard algorithm are given. The 
first (Section VII) provides phase coherence as part of the equalization 
process for hypothetical systems employing highly stable oscillators. 
The original algorithm required a follow-on phase-locked loop. Section 
VIII presents an algorithm that provides for cross-polarization can- 
cellation as well as equalization. 

Throughout much of this paper the mathematical theory is idealized by
assuming an infinite-tap equalizer. This is because it is very awkward
to deal analytically with the finite-tap case. One is left wondering if
there is any pitfall associated with the finite-tap algorithm. The current
status of this issue is addressed in Section IX. There is some evidence
that the contrast between the infinite-tap and finite-tap cases is nearly
as tame as it is with MMSE equalization.

A second purpose of this paper is to answer the question of whether
the potential for using the aforementioned algorithm in digital radio
systems is significant enough to warrant detailed study. An affirmative
answer is given.

Fig. 1—Simplified baseband model.

A discussion is given in Section X. The Appendix contains infor- 
mation on QAM constellation moments. 


II. THE MODEL


Since our primary goal is the presentation of theoretical results, a
simple setting is used. We work with the equivalent baseband model
shown in Fig. 1. The complex data sequence is denoted $a$. The elements
of $a = (\ldots, a_0, a_1, \ldots)$ represent independent identically distributed
choices from a QAM constellation, each point of which is equally
likely. We normalize so that $E|a_n|^2 = 1$.
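A minimal sketch of this normalization follows. It assumes a square constellation with odd-integer coordinates (16-QAM for four levels per axis); the function name and construction are illustrative. The sketch also checks the property mentioned in the introduction, namely that the average squared constellation vector is zero.

```python
def qam_constellation(levels):
    # Square QAM with odd-integer coordinates; levels=4 gives 16-QAM
    coords = [2 * k - (levels - 1) for k in range(levels)]
    points = [complex(x, y) for x in coords for y in coords]
    # Scale so that the average power E|a|^2 equals 1
    power = sum(abs(p) ** 2 for p in points) / len(points)
    scale = power ** 0.5
    return [p / scale for p in points]


points = qam_constellation(4)
mean_power = sum(abs(p) ** 2 for p in points) / len(points)  # E|a|^2
mean_square = sum(p * p for p in points) / len(points)       # E[a^2]
print(len(points), round(mean_power, 12), abs(mean_square) < 1e-12)
```

For a square constellation the in-phase and quadrature components have equal power and are uncorrelated, which is why $E[a^2]$ vanishes while $E|a|^2$ does not.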

The complex sequence $h$ represents samples of the impulse response
of the transmitter and medium combination. The sequence $c$ represents
the complex equalizer taps. Using the $*$ symbol for convolution,
the sampled impulse response of the channel and equalizer in combination
is denoted $s = h * c$, the received data is denoted $y = a * h$, and
the sequence after the equalizer is $z = a * h * c$. This notation is
consistent with the notation of Ref. 1.
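
The convolution notation can be made concrete with finite sequences. The following is a minimal sketch, assuming finite-length sequences stored as Python lists (the paper works with doubly infinite sequences); the function name `convolve` and the sample values of `h`, `c`, and `a` are illustrative and do not come from the paper.

```python
def convolve(u, v):
    """Discrete convolution of two finite sequences given as lists."""
    out = [0j] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for k, vk in enumerate(v):
            out[i + k] += ui * vk
    return out


# Hypothetical short channel h, equalizer c, and data block a (not from the paper).
h = [1.0, 0.5j]
c = [1.0, -0.5j]
a = [1 + 1j, -1 + 1j, 1 - 1j]

s = convolve(h, c)         # overall response s = h * c
y = convolve(a, h)         # received sequence y = a * h
z_from_y = convolve(y, c)  # equalizer output computed as (a * h) * c
z_from_s = convolve(a, s)  # the same output computed as a * (h * c)

print(s)
print(max(abs(p - q) for p, q in zip(z_from_y, z_from_s)))
```

Because convolution is associative, the equalizer output can be computed either from the received sequence or from the combined response, which is why the analysis can be carried out in terms of $s$ alone.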

Also, $h$ is assumed to have a continuous Fourier transform devoid
of spectral nulls. Consequently, $h$ has a convolution inverse $h^{-1}$ satisfying
$h * h^{-1} = \mathbf{0}1\mathbf{0}$. By $\mathbf{0}$ we mean an infinite sequence of zeroes, left
directed if preceding a number and right directed if following a number.
If $\mathbf{0}$ is written without abutting a number, it means the sequence of
zeroes extending from $-\infty$ to $+\infty$.

A more refined model of the terrestrial digital radio environment
would include additive white Gaussian noise at the input to the
receiver. However, the major interest in this paper is in prompt
reestablishment of adequate equalization after a cataclysmic event during
which the data-detection capability was completely lost (so $P_e > 1/2$).
The situation is that the medium, despite the presence of additive
noise, has the potential of providing adequate performance if only the
equalizer could be properly aligned. In such situations the s/n is
generally so large that the optimal MMSE equalizer including noise effects
is only slightly better than the inverse equalizer,’ which neglects noise.
Once the equalization can provide for a $P_e$ in the neighborhood of 0.1,
conventional linear MMSE equalization is an option. To be prudent,
unless stated otherwise, this option is assumed. So neglecting the noise
is not a substantive shortcoming of the analysis. MMSE Decision
Feedback (DF) equalization is an alternative option. The bane of DF
has been the possibility of the detection process entering a disastrous
error propagation mode. With Godard’s algorithm as a fallback, the
error propagation is no longer disastrous. The theoretical performance
of DF is superior to linear equalization.®’° Therefore, in a
severe fade, DF helps in forestalling the failure of the decision-directed
mode. However, in some important digital radio applications the
theoretical advantage of DF over linear is marginal.’


III. VECTOR FIELD FOR EQUALIZER TAP EVOLUTION


Generally a tap update is a random vector. The subject of “conver- 
gence” of a tap update procedure prompts several interesting questions. 
What is the underlying trend in the tap evolution? If the tap setting 
is tending to some target region, how long does it take to get there?
What is the long-term equilibrium distribution of the tap settings? 
Recasting the last question less mathematically: Do the tap settings 
significantly stray from the target region once they get there? These 
are difficult questions. In this paper we will primarily attack the first 
question, say a little about the second, and defer the third. The target 
region can be defined as constituting those settings for which MMSE 
can commence. We proposed above that MMSE be used when possible. 
Consequently, for the third question we defer to MMSE theory. 

To address the first question, we shall deal with the mathematical 
abstraction of vector fields in tap space. That is, at each point in tap 
space there is a vector that, when added to the current vector of tap
settings, points to where the taps should nominally be set in the 
immediate future. Anticipating that time spans of interest in tap 
bootstrapping processes involve over 10* symbols, it is sometimes 
useful to approximate and represent tap evolution in continuous rather 
than discrete time. The vectors are smooth functions of the taps. The 
relative magnitudes of the vectors relate to the relative speed of change 
of the taps. 

The fields of interest to us are conservative; that is, they are derived
by taking the gradient of a potential function $\phi$. The gradient depends
on the current tap setting and the channel-impulse response. We
stress that in implementations the channel response is not known and
the tap changes (gradients) are derived from readily accessible random
variables whose mean values are the desired quantities. (This is the
so-called stochastic gradient approach. Conventional MMSE utilizes
a stochastic gradient approach requiring accurate data estimates.)
The potential functions we will work with are fourth-degree polynomials
in the tap gains and their complex conjugates. Exceptional
points, where the field vector is a zero vector, are called stationary
points. For potential functions such as fourth-degree polynomials the
Hessian matrix

$$\left(\frac{\partial^2 \phi}{\partial \bar{c}_i\,\partial c_j}\right)$$

types the stationary points. (We are using a bar here for complex
conjugation.) This typing stems from a power-series expansion about
a stationary point and is as follows:

positive semidefinite $\Leftrightarrow$ local minimum
negative semidefinite $\Leftrightarrow$ local maximum
indefinite $\Leftrightarrow$ unstable equilibrium.


In the sequel, when we say that an algorithm “converges” to some set, 
we mean that the set constitutes the points of stable equilibrium of 
the corresponding vector field. 

The stochastic gradients of $|z_n|^4$ and $|z_n|^2$ will be used later. We
record them for reference for even positive powers $Q$. We obtain
(similar to Ref. 1)

$$\frac{\partial |z_n|^Q}{\partial \bar{c}} = \frac{\partial}{\partial \bar{c}}\left[(y_n' c)(\bar{y}_n' \bar{c})\right]^{Q/2} = \frac{Q}{2}\left[(y_n' c)(\bar{y}_n' \bar{c})\right]^{(Q/2)-1}(y_n' c)\,\bar{y}_n.$$

The prime denotes transpose and the bar denotes conjugation. Since
$(Q/2) - 1$ is a nonnegative integer the computation of the stochastic
gradient of $|z_n|^Q$ involves only multiplications of readily accessible
quantities. These quantities are the tap settings $c$ and the vector of tap
outputs $y_n$. We note that the taps evolve so that the vector $c$ is a function
of time and the notation has suppressed that dependence thus far.
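
As a check on the gradient expression, the sketch below evaluates $(Q/2)|z|^{Q-2} z\,\bar{y}$ and compares it with a central-difference estimate of the derivative with respect to the conjugate taps. It assumes the convention $\partial/\partial\bar{c} = \tfrac{1}{2}(\partial/\partial x + j\,\partial/\partial y)$ for $c = x + jy$; the function names and the sample vectors are hypothetical.

```python
def grad_abs_pow(c, y, Q):
    """Gradient of |z|**Q with respect to conj(c), where z = sum_k y[k] * c[k]."""
    z = sum(yk * ck for yk, ck in zip(y, c))
    scale = (Q / 2) * abs(z) ** (Q - 2) * z
    return [scale * yk.conjugate() for yk in y]


def numeric_grad(c, y, Q, eps=1e-6):
    """Central-difference estimate of d|z|**Q / d conj(c_k) = (d/dx + j d/dy) / 2."""
    def cost(taps):
        return abs(sum(yk * ck for yk, ck in zip(y, taps))) ** Q

    grads = []
    for k in range(len(c)):
        def shifted(delta):
            taps = list(c)
            taps[k] += delta
            return cost(taps)
        d_re = (shifted(eps) - shifted(-eps)) / (2 * eps)
        d_im = (shifted(1j * eps) - shifted(-1j * eps)) / (2 * eps)
        grads.append(0.5 * (d_re + 1j * d_im))
    return grads


# Hypothetical tap vector and tap-output vector (illustrative values only).
c = [0.8 + 0.1j, -0.2j]
y = [1 - 1j, 0.5 + 2j]
for Q in (2, 4):
    exact = grad_abs_pow(c, y, Q)
    approx = numeric_grad(c, y, Q)
    print(Q, max(abs(e - n) for e, n in zip(exact, approx)))
```

Only products of the tap outputs and the equalizer output appear in `grad_abs_pow`, which is the sense in which the stochastic gradient is readily accessible.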


IV. DESCRIPTION OF GODARD’S QUARTIC ALGORITHM 


In this section we review the algorithm of Ref. 1 for updating
equalizer taps, for which, remarkably, data estimates are not required.
The algorithm, if properly initialized, is known to converge to a vector
of the form $h * c = (\ldots, 0, e^{j\theta}, 0, 0, \ldots)$, which is the ideal Nyquist
response, except for the presence of an arbitrary phase $\theta$. In other
words, the algorithm converges so that $z$ is the same as $a$ except that
the constellation needs to be rotated into position. The four-fold
ambiguity associated with positioning the constellation is not a problem
since the data is assumed to be differentially encoded at the
transmitter.

Let $c_{[n]}$ denote the tap vector at time $n$. The tap update procedure
is based on a gradient minimization of


$$\phi(c_{[n]}) = E(|z_n|^2 - R)^2, \tag{1}$$


where $R = E|a_n|^4/E|a_n|^2 = E|a_n|^4$ is a constant depending only on
the signal constellation. Thus the tap update procedure is


$$c_{[n+1]} = c_{[n]} - \lambda \bar{y}_n z_n (|z_n|^2 - R), \tag{2}$$


where $\lambda$ is the step size.
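
A single tap update can be sketched as follows. This is a minimal illustration of the update rule, assuming a finite tap vector held in a Python list; `godard_step` and the sample values are hypothetical names and numbers, and the example uses unit-modulus data so that $R = 1$.

```python
import cmath
import math


def godard_step(c, y, R, lam):
    """One update of eq. (2): c <- c - lam * conj(y) * z * (|z|^2 - R), z = y'c."""
    z = sum(yk * ck for yk, ck in zip(y, c))
    err = abs(z) ** 2 - R
    return [ck - lam * yk.conjugate() * z * err for ck, yk in zip(c, y)]


# Hypothetical single-tap example with unit-modulus data, so that R = 1.
a = (1 + 1j) / math.sqrt(2)
rotated_tap = [cmath.exp(0.3j)]           # ideal gain with an arbitrary rotation
after_rotated = godard_step(rotated_tap, [a], R=1.0, lam=0.01)
after_large = godard_step([2 + 0j], [a], R=1.0, lam=0.01)

print(abs(after_rotated[0] - rotated_tap[0]))  # essentially zero: rotation is not penalized
print(abs(after_large[0]))                     # below 2: excess gain is pulled back
```

The update depends on the data only through $|z_n|^2$, so a tap with the right gain but an arbitrary phase is left alone, which matches the phase ambiguity described in the text.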
One way to motivate eq. (1) is to consider


$$\phi = E(|z_n|^2 - |a_n|^2)^2 \tag{3}$$

as an error criterion. Overlook, for a moment, that estimates of $|a_n|^2$
are not available at the receiver immediately after a severe fade. $\phi$ has
some nice features. After all it measures zero when $z_n = a_n$. The
systems we are dealing with are nominally linear, so that it seems
reasonable to speculate that when $\phi$ is zero, then the system is ideally
equalized modulo a rotation of the constellation. We must replace
$|a_n|^2$ in (3) by a more reasonable quantity. We replace it by a constant
chosen to make the two expectations in (1) and (3) have substantial
agreement when expressed in more fundamental form in terms of $h$
and $c$. (See Ref. 1 for details.) This completes the interpretation of the
choice of $R$.

Recalling that $s = h * c$, the potential $\phi(c)$ is expressed as


$$\phi(c) = 2\Big(\sum_k |s_k|^2\Big)^2 - (2 - E|a_n|^4)\sum_k |s_k|^4 - 2E|a_n|^4 \sum_k |s_k|^2 + (E|a_n|^4)^2. \tag{4}$$

Let $L^2$ be the number of constellation points and recall the normalization
$E|a_n|^2 = 1$. Regarding $s = h * c$ as the independent variable,
we have


$$\phi(s) = 2\Big(\sum_k |s_k|^2\Big)^2 - \frac{3(L^2+1)}{5(L^2-1)}\sum_k |s_k|^4 - 2\Big(\frac{7L^2-13}{5(L^2-1)}\Big)\sum_k |s_k|^2 + \Big(\frac{7L^2-13}{5(L^2-1)}\Big)^2. \tag{5}$$


The equality (4) comes from straightforward harmonic analysis (it
appears in Ref. 1) and equality (5) uses the Appendix.
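
The constellation moment that enters the potential can be checked by direct enumeration. The sketch below assumes a square constellation with $L$ odd-integer levels per rail, rescaled so that $E|a_n|^2 = 1$, and compares the enumerated fourth moment with the expression $(7L^2-13)/[5(L^2-1)]$; the function names are hypothetical.

```python
from fractions import Fraction


def qam_fourth_moment(L):
    """E|a|^4 for an L-by-L square QAM constellation scaled so that E|a|^2 = 1."""
    levels = range(-(L - 1), L, 2)  # odd-integer levels on each rail
    points = [(x, y) for x in levels for y in levels]
    m2 = Fraction(sum(x * x + y * y for x, y in points), len(points))
    m4 = Fraction(sum((x * x + y * y) ** 2 for x, y in points), len(points))
    return m4 / (m2 * m2)


def closed_form_moment(L):
    """The fourth-moment expression (7L^2 - 13) / (5(L^2 - 1))."""
    return Fraction(7 * L * L - 13, 5 * (L * L - 1))


def potential(s, L):
    """phi(s) from eq. (4), with E|a_n|^4 taken from the constellation."""
    R = float(qam_fourth_moment(L))
    p2 = sum(abs(x) ** 2 for x in s)
    p4 = sum(abs(x) ** 4 for x in s)
    return 2 * p2 ** 2 - (2 - R) * p4 - 2 * R * p2 + R ** 2


for L in (2, 4, 8):
    print(L, qam_fourth_moment(L), closed_form_moment(L))
print(potential([0, 1j, 0], 4), potential([0.6, 0.8, 0], 4))
```

The last line evaluates the potential at a single unit-modulus tap and at a two-tap response of the same total energy; the single-tap response gives the smaller value.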

In the following sections new results are presented on the theory of
quartic algorithms.


V. CONVERGENCE OF THE QUARTIC ALGORITHM


In the previous section we mentioned that the algorithm converges 
if it is properly initialized. Now we will show that such initialization 
is not needed for convergence. In a certain sense the algorithm will 
converge regardless of the initialization. More precisely, we show that 
the only loci of stability of the vector field in tap space are the family 
of sets 


$$E_k = \{c \mid c * h = e^{j\theta}(\mathbf{0}1_k\mathbf{0}),\ \theta \text{ an arbitrary real}\}.$$


Each $E_k$ is an ellipse in tap space since it is expressible as a linear
transformation of a circle. By $1_k$ we mean that 1 occurs in the $k$th
position. The set $E \triangleq \bigcup_{k=-\infty}^{\infty} E_k$ comprises the points of global minima. These
points are the ideal Nyquist responses (modulo a phase adjustment of
the constellation).

A gradient search can only terminate on one of these circles of local
minima. There are no spurious local minima. The only other stationary
points are $c = 0$, which is a local maximum corresponding to the
shutdown of the receiver, and some points of unstable equilibrium (“saddle
points”). We now substantiate these claims. The mathematical approach
used in the remainder of this section is similar to that used in
Sections VII and VIII. However, the demonstration here is much
simpler.


5.1 Stationary points relative to overall system response 


To demonstrate the character of the stationary points, we first 
discuss the stationary behavior relative to s, from which the nature of 
the stationary points relative to c will follow. 

With respect to the conjugate coordinates $\bar{s}_i$, we take the partial
derivative of $\phi$ and equate to zero to get


$$\frac{\partial \phi}{\partial \bar{s}_i} = 4\Big(\sum_k |s_k|^2\Big)s_i - \frac{6}{5}\Big(\frac{L^2+1}{L^2-1}\Big)|s_i|^2 s_i - 2\Big(\frac{7L^2-13}{5(L^2-1)}\Big)s_i = 0.$$


So $s = 0$ is a solution. Dividing through by $s_i$, for $s_i \neq 0$, gives a very
simple equation for the stationary points. In general, the stationary
points are the vectors having the property that there are a finite
number $M \geq 0$ of nonzero coordinates, all $M$ of which are
$(7L^2 - 13)[10M(L^2 - 1) - 3(L^2 + 1)]^{-1}$ in squared modulus. These stationary
points are typed as


$M = 0$: $s = 0$, a local maximum
$M = 1$: global minima
$M \geq 2$: unstable equilibria.

The reasoning behind this typing of each case will now be given. Keep
in mind the expression of $\phi(s)$ given in eq. (4).


The case M = 0. 


This corresponds to $s = 0$, which is a local maximum. Simply note
that sufficiently small perturbations of 0 serve to reduce the value of
$\phi(s)$. The reason for this reduction is that, for a very small perturbation,
the quartic effect is negligible relative to the quadratic one.


The case M = 1.


These points are loci of global minima. The global minimum cannot
be attained if more than one component of $s$ is nonzero. Indeed, if $s$
has more than one nonzero component then $\hat{s} \triangleq (\mathbf{0}, (\sum_i |s_i|^2)^{1/2}, \mathbf{0})$ gives
$\phi(\hat{s}) < \phi(s)$. Say that the $i$th component of $s$ is the only nonzero
component. The function $\phi(s)$ is then minimized when $|s_i|^2 = 1$.


The case M ≥ 2.


These points are unstable equilibria (“saddles”). The instability for a
stationary point $s$ with $M \geq 2$ is shown by first decreasing $\phi(s)$ by a
perturbation of two nonzero components of $s$ that leaves $\sum_i |s_i|^2$
invariant. Secondly, $\phi(s)$ is increased by a perturbation that simply
increases the magnitude of a zero component by a sufficiently small
positive number.
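
The two perturbations used in the saddle argument can be tried numerically. The sketch below is an illustration under stated assumptions: it takes the potential in the form $2(\sum|s_k|^2)^2 - (2-R)\sum|s_k|^4 - 2R\sum|s_k|^2 + R^2$ with $R = (7L^2-13)/[5(L^2-1)]$, uses $L = 4$ as an example, and truncates the response to three coordinates. The names `phi`, `stationary_modulus_sq`, `traded`, and `raised` are hypothetical.

```python
import math


def phi(s, L):
    """Quartic potential for an L-by-L constellation, R = (7L^2 - 13) / (5(L^2 - 1))."""
    R = (7 * L * L - 13) / (5 * (L * L - 1))
    p2 = sum(abs(x) ** 2 for x in s)
    p4 = sum(abs(x) ** 4 for x in s)
    return 2 * p2 ** 2 - (2 - R) * p4 - 2 * R * p2 + R ** 2


def stationary_modulus_sq(M, L):
    """Squared modulus shared by the M nonzero coordinates of a stationary point."""
    return (7 * L * L - 13) / (10 * M * (L * L - 1) - 3 * (L * L + 1))


L = 4
r = stationary_modulus_sq(2, L)
saddle = [math.sqrt(r), math.sqrt(r), 0.0]

d, eps = 0.01, 0.01
# Trade energy between the two nonzero coordinates; the sum of |s_i|^2 is unchanged.
traded = [math.sqrt(r + d), math.sqrt(r - d), 0.0]
# Raise the magnitude of a zero coordinate by a small amount.
raised = [math.sqrt(r), math.sqrt(r), eps]

print(stationary_modulus_sq(1, L))
print(phi(traded, L) - phi(saddle, L), phi(raised, L) - phi(saddle, L))
```

The first difference is negative and the second positive, so the two-coordinate stationary point can be left in a descending direction and in an ascending one, as the text argues.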


5.2 Stationary points relative to tap weights 


The results of Section 5.1 are only of incidental value since we are 
interested in the vector field relative to c not relative to s. However, 
the results thus far can be interpreted to provide what we need, as we 
now explain. 

The operation of convolution of $c$ with $h$ represents an invertible
continuous function on tap space. Since $\phi(s)$ is a continuous real
function of $s$, $\phi(s) = \phi(h * c)$ is also a continuous real function of $c$.
Recall the very elementary fact that notions like local maximum, local
minimum, and unstable equilibrium are defined as neighborhood properties
in tap space. It is a property of continuity that the mappings $h\,*$
and $h^{-1}\,*$ leave invariant the entire system of neighborhoods. It is
immediate from the results of Section 5.1 that the points in $E$ are the
only points of stability in tap space when the taps evolve in accordance
with the specified vector field.

For those uncomfortable with the above argument we give another
level of detail. Say $\hat{s} = h * \hat{c}$ is a point of local minimum of $\phi$. This
means that there is an open set in tap space containing $\hat{s}$, where $\phi(\hat{s})$
is the least number achieved. By convolution of elements of this open
set with $h^{-1}$, one creates an open set about $\hat{c}$ on which $\phi(h * \hat{c})$ is the
least number achieved. Local maxima and unstable equilibria are
handled similarly.


VI. A PRELIMINARY LOOK AT CONVERGENCE SPEED 
6.1 Interpretation of a published simulation 


To consider quantitatively the subject of convergence speed, we fix 
on a hypothetical example. Namely, we focus on a system with a 
transmission speed measured in tens of megabauds. Outage time 
accumulates when P, = 107°, and is assumed to be limited to 150 
seconds per year per hop (in a nominal system). The hypothetical 
channel is assumed to fade in accordance with the Rummler model.** 

Let $P'(t)$ denote the probability of bit error that an ideally equalized
system can provide at time $t$. Then $P'(t) \leq P(t)$, where $P(t)$ is the
probability of bit error that is actually achieved at time $t$. For the
purpose of discussion, assume that the clear air s/n is such that if P(t) 
were identically equal to P’(t), the outage objective would be roundly 
met. Suppose a fade is so severe that decision direction of the equali- 
zation process becomes impossible. When the fade subsides to the 
point where P’ < 107°, the objective is to boot (or be booting) the 
equalizer so that P < 10~° occurs with a negligible time lag. An 
aggressive booting procedure, operating with an uncertain frequency- 
selective transfer characteristic, is viewed as a key element in achieving 
substantial outage reduction. The alternative of waiting for the dis- 
persion to clear to the extent that a crude equalizer will open the eye 
is manifestly unacceptable. 

The analysis of the booting process is difficult for two reasons. First, 
the extremal statistics of fade time dynamics are not well established. 
Second, even if such a model were available, it would be difficult to 
mathematically represent the time dynamics of an equalizer based on 
the Godard algorithm. However, if we assume that the time interval 
during which P’ > 10™° lasts for a few seconds, then an equalizer 
booting time measured in tens of milliseconds would be negligible in 
terms of contributing to outage time. Indeed, even a 100-millisecond 
boot time would be negligible if the preponderance of the time occurs 
before the level P’ = 107° is down-crossed. Assuming that the channel 
transfer characteristic does not change appreciably during booting, we 
take a preliminary look at whether the quartic algorithm can boot an 
equalizer in tens of milliseconds. 

Reference 2 reports simulations of transient responses from a cold 
start. The examples are for a voiceband application; however, they 
can be interpreted for a digital radio context. One of the examples is 
particularly interesting. (See Fig. 7d in Ref. 1.) Although 64 QAM is
not treated, rectangular constellations with 16 and 32 points are
considered. The effect on the transient response of increasing from 16
to 32 is negligible. (This is expected because the constellation moments
for $L^2 = 16$ points are already nearing their $L \to \infty$ asymptotes, as
shown in the Appendix.) The channel used to generate the transient
responses has as severe a dispersion as can be expected in the digital
radio application. This statement is based on a rough indicator of
dispersion, namely $\max_\omega |H(\omega)|^2 / \min_\omega |H(\omega)|^2$. ($H(\omega)$ is the channel
transfer characteristic.) The unimodal $|H(\omega)|^2$ had a higher indicator
of dispersion than the most extreme of the 25,000 fades in a comprehensive
library’ generated from the Rummler model.”

The transient responses of Ref. 2 suggest that booting times of less 
than 10° symbols may be possible. In the digital radio application at 
tens of megabauds, 10° symbols are received in a few milliseconds. 

A transient analysis of the algorithm is certainly an ambitious 
undertaking. However, it is possible to conduct a mathematical anal- 
ysis of eq. (2) for a hypothetical, real, one-dimensional case. We sketch 
this analysis in the following subsection. By projecting the convergence 
behavior of interest into a simple, understandable context, useful 
insight is gained. 


6.2 Analysis of a related one-dimensional evolution 


In the much simpler domain of least-mean-square adaptation algo- 
rithms,’” the transient behavior for the one-dimensional case is ana- 
lyzed. Then a heuristic argument is made to extend the transient 
analysis to the higher-dimensional case. In the one-dimensional anal- 
ysis of the Godard algorithm that follows, we will see how the corre- 
sponding MSE behavior of the algorithm compares with an MMSE 
evolution. The MMSE evolution assumes known data at the receiver. 

In this one-dimensional case, both $h$ and $c$ are real scalars but the
data is complex. In this subsection, a coordinate index is unnecessary
so we write $c_i$ for $c_{[i]}$. The gradient algorithm is


$$c_{i+1} = c_i - \lambda \frac{\partial}{\partial c_i}(|z_i|^2 - R)^2. \tag{6}$$


For expositional simplicity in the sequel, we work with only the
asymptotic form ($L \to \infty$) of the constellation moments. (Section V
served to demonstrate that dealing with finite $L$ is not an essential
complication. The Appendix shows the moment asymptotes are rapidly
approached.) Compute for later use


$$\mu(c_i) \triangleq E[c_{i+1} - c_i \mid c_i] = -5.6\lambda h[(hc_i)^3 - hc_i] \tag{7a}$$

$$\sigma^2(c_i) \triangleq E[(c_{i+1} - c_i)^2 \mid c_i] = \lambda^2 h^2[68.3(hc_i)^6 - 104(hc_i)^4 + 44.2(hc_i)^2]. \tag{7b}$$

Note that both $\mu$ and $\sigma$ are linear in $\lambda h$.
Were it not for the stochastic aspect, the evolution would be described
by the nonlinear difference equation


$$c_{i+1} = c_i - 5.6\lambda h((hc_i)^3 - hc_i). \tag{8}$$


A crude representation of the evolution (8) with small $\lambda$ uses a
deterministic differential equation. A more refined representation of
(6) uses a stochastic differential equation. We look at both of these.
The deterministic differential equation is


$$\frac{dc}{dt} = -5.6\lambda h[(hc)^3 - (hc)], \tag{9a}$$


in the time scale where one unit equals one symbol time. The time, $T$,
that it takes for $c$ to evolve from $c_0 \neq 0$ to a target setting $c_t$ is easily
seen to be

$$T = \frac{1}{11.2\lambda h^2}\ln\left[\frac{(hc_t)^2\,(1-(hc_0)^2)}{(hc_0)^2\,(1-(hc_t)^2)}\right]. \tag{9b}$$

The corresponding formulae for the MMSE algorithm are

$$\frac{dc}{dt} = -2\lambda h(hc - 1) \tag{9c}$$

$$T = \frac{1}{2\lambda h^2}\ln\left(\frac{1 - hc_0}{1 - hc_t}\right). \tag{9d}$$


The formula (9d) is consistent with the statement in Ref. 12 that the
time constant for the MMSE stochastic-gradient algorithm is $(2\lambda h^2)^{-1}$.
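
The transit times for the two deterministic evolutions can be compared with a direct numerical integration. The sketch below is an illustration under stated assumptions: the quartic closed form is obtained here by separating variables in $d(hc)/dt = -5.6\lambda h^2[(hc)^3 - hc]$, which gives $T = (11.2\lambda h^2)^{-1}\ln\{(hc_t)^2[1-(hc_0)^2]/((hc_0)^2[1-(hc_t)^2])\}$ for $0 < hc_0 < hc_t < 1$, and the MMSE closed form is $T = (2\lambda h^2)^{-1}\ln[(1-hc_0)/(1-hc_t)]$. The step size, channel gain, and start and target values are hypothetical.

```python
import math


def quartic_transit_time(hc0, hct, lam, h):
    """Closed-form time for the quartic evolution to carry hc from hc0 to hct (0 < hc0 < hct < 1)."""
    ratio = (hct ** 2 * (1 - hc0 ** 2)) / (hc0 ** 2 * (1 - hct ** 2))
    return math.log(ratio) / (11.2 * lam * h ** 2)


def mmse_transit_time(hc0, hct, lam, h):
    """Closed-form time for the MMSE evolution d(hc)/dt = -2*lam*h^2*(hc - 1)."""
    return math.log((1 - hc0) / (1 - hct)) / (2 * lam * h ** 2)


def euler_time(rate, hc0, hct, dt=0.05):
    """Forward-Euler integration of d(hc)/dt = rate(hc) until hc reaches hct."""
    u, t = hc0, 0.0
    while u < hct:
        u += dt * rate(u)
        t += dt
    return t


lam, h = 1e-3, 1.0                       # hypothetical step size and channel gain
hc0, hct = 0.1, 0.9                      # hypothetical start and target products
t_quartic = euler_time(lambda u: -5.6 * lam * h ** 2 * (u ** 3 - u), hc0, hct)
t_mmse = euler_time(lambda u: -2 * lam * h ** 2 * (u - 1), hc0, hct)

print(t_quartic, quartic_transit_time(hc0, hct, lam, h))
print(t_mmse, mmse_transit_time(hc0, hct, lam, h))
```

Times are in units of one symbol interval. Both closed forms diverge logarithmically as the target approaches $hc_t = 1$, which is why a target that is only approximately 1 matters in practice.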

If one uses values in the right-hand side of (9b) and (9d) that are
reflective of the digital radio application, the logarithmic term can be
shown to be of little consequence in assessing order-of-magnitude
effects. The latitude in being able to set $c_t$ so that $hc_t$ is only approximately
1 is crucial to taming the singularity. Reference 12 on MMSE
explains that, in higher dimensions, $h^2$ is replaced by (trace $R$/$N$),
where $R$ is the channel autocorrelation matrix and $N$ is the number of
equalizer taps. In data communication applications, a normalized form
of the channel is often appropriate to account for AGC. In a one-dimensional
context, AGC gives $hc = 1$, trivializing the equalization.

Refining (9a) to bring in the stochastic element of the evolution, we
have


$$dc = \mu(c)\,dt + \sigma(c)\,d\beta(t), \tag{10}$$


where $d\beta/dt$ is a standard white-noise process. It is difficult to give
a complete analysis of this equation; however, some results are possible.

A mathematical “experiment” was made to test the restoring action
of the dynamic represented by (10). The mathematical tools in Ref.
13 were used. Set the tap $c$ at a very large value $c_0$ and consider the
transit time to another large value $y$. Assume $c_0 \gg y \gg c_t$. Under the
deterministic evolution, (9a), the expected time until y is hit for the 
first time is proportional to y~*. With the stochastic dynamic, the 
expected hitting time is proportional to y~*. 

As the tap setting $c$ nears the target region, the first-order term in
the power series for $\mu(c)$ and $\sigma(c)$ is linear and constant, respectively.
This limiting evolution is that of the elastically bound particle (also
called Ornstein-Uhlenbeck"*). A complete transient analysis of the
dynamics of the elastically bound particle is classical.'* We have


$$E[hc(t) - 1 \mid c(0) = c_0] = (hc_0 - 1)e^{-11.2\lambda h^2 t} \tag{11a}$$


$$\mathrm{Var}[hc(t) - 1 \mid c(0) = c_0] = 0.381\lambda h^2(1 - e^{-22.4\lambda h^2 t}). \tag{11b}$$


We are assuming the quartic algorithm is only for bootstrapping, so
$0.381\lambda h^2$ does not correspond to the equilibrium variance.
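
The numerical constants in the drift, the variance, and the linearized evolution can be cross-checked from the asymptotic constellation moments. The sketch below is a hedged reconstruction, not the paper's own computation: it assumes the $L \to \infty$ constellation is uniform on a square scaled so that $E|a|^2 = 1$, that $z_i = hc_i a_i$ and $R = E|a|^4$, and that the linearized variance is $\sigma^2 h^2 / (2 \times \text{decay rate})$ at $hc = 1$. All names are hypothetical. Under these assumptions the three coefficients of (7b) evaluate to roughly 68.3, 103.7, and 43.9; the last is slightly below the 44.2 quoted in (7b), which may reflect rounding or a different moment convention in the original computation.

```python
from fractions import Fraction
from math import comb


def even_moment(n):
    """E|a|^(2n) for a uniform on a square, scaled so that E|a|^2 = 1."""
    b2 = Fraction(3, 2)                         # half-side squared, from 2*b2/3 = 1

    def ex(k):
        return b2 ** k / (2 * k + 1)            # E x^(2k) for x uniform on [-b, b]

    return sum(comb(n, j) * ex(j) * ex(n - j) for j in range(n + 1))


m4, m6, m8 = even_moment(2), even_moment(3), even_moment(4)
R = m4                                          # asymptotic value of E|a|^4
drift_coeff = 4 * m4                            # multiplies lam*h*[(hc)^3 - hc] in the drift
var_coeffs = (16 * m8, 32 * R * m6, 16 * R * R * m4)
decay_rate = 2 * drift_coeff                    # slope of the drift at hc = 1, per lam*h^2
sigma2_at_target = var_coeffs[0] - var_coeffs[1] + var_coeffs[2]
ou_variance = sigma2_at_target / (2 * decay_rate)   # multiplies lam*h^2 in the variance

print(float(drift_coeff), [float(v) for v in var_coeffs])
print(float(decay_rate), float(ou_variance))
```

The ratio of `decay_rate` to the MMSE rate of 2 is the factor of 5.6 in time constant that the text goes on to discuss.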

It is interesting to contrast with MMSE, for which the expected
time from start to target is available in closed form using the method
of Ref. 13. We get


$$E[T] = \frac{1}{2(\lambda h^2 + 1.4\lambda^2 h^4)}\ln\left(\frac{1 - hc_0}{1 - hc_t}\right) \approx \frac{1}{2\lambda h^2}\ln\left(\frac{1 - hc_0}{1 - hc_t}\right) \quad (\lambda \text{ small}).$$

Also,

$$E(hc(t) - 1 \mid c(0) = c_0) = (hc_0 - 1)e^{-2\lambda h^2 t} \quad \text{for each } t, \tag{11c}$$
and 


Var(he(t) — 1] c(0) = c) ~ (he, — 1)2e7!2"* ast—>o. (11d) 


Noise and quantization effects, which are not included, will predomi- 
nate for large t and thus serve to bound the MMSE variance above 
zero. 

Interestingly, for both the MMSE and quartic algorithm, the expo- 
nential decay that is observed in the simple deterministic analysis is 
maintained when stochastic effects are included. Evolution under the 
quartic potential compares favorably with that of the quadratic. How- 
ever, the apparent advantage of the quartic of a factor of 5.6 in time 
constant is illusory, as we next discuss.

[Figure 2 plots the potential function against the product $ch$ for two cases: the quartic potential $1.4(ch)^4 - 2.8(ch)^2 + 1.96$ and the quadratic potential $(|ch| - 1)^2$. Annotations: “system is turned off”; “$ch = 1$ stable (system is equalized)”.]

Fig. 2—Potential functions for simple one-dimensional case.


Relation (8a) suggests that one should make $\lambda$ large to reduce the
transit time. However, $\lambda$ must be set carefully. The differential equation
loses its effectiveness as an approximation to (8) once $\lambda$ gets too
large. An analysis of (8) points out that, for $\lambda$ sufficiently large,


$$|c_{i+1}| > |c_i|,$$


which is a disastrous instability. The wall of the potential well for
Godard’s algorithm is quartic while, for MMSE, it is quadratic. (See
Fig. 2 to contrast the quartic and quadratic potentials.) Consequently,
the quartic algorithm requires a smaller $\lambda$ to avoid instability than
MMSE requires. Indeed, in the simulations of Ref. 2 it was noticed
that a smaller value of $\lambda$ was needed. As $c_i$ approaches the target
region, one would like $\lambda$ small to encourage stepping into rather than
over the target region. A key area of future study is to determine a
dynamic procedure for prudent setting of $\lambda$.
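
The sensitivity to the step size can be seen by iterating the deterministic recursion $c_{i+1} = c_i - 5.6\lambda h[(hc_i)^3 - hc_i]$ directly. The sketch below is illustrative only; the channel gain, the initial tap, and the two step sizes are hypothetical values chosen to show one well-behaved run and one runaway run.

```python
def iterate_eq8(c0, lam, h, steps):
    """Iterate the deterministic recursion (8) for a given number of steps."""
    c = c0
    for _ in range(steps):
        c = c - 5.6 * lam * h * ((h * c) ** 3 - h * c)
    return c


h, c0 = 1.0, 2.0                                  # hypothetical channel gain and initial tap
settled = iterate_eq8(c0, lam=0.01, h=h, steps=2000)
runaway = iterate_eq8(c0, lam=0.2, h=h, steps=3)  # few steps only; the iterates blow up

print(settled, runaway)
```

With the smaller step the tap settles at $hc = 1$; with the larger step the cubic term overshoots so strongly that the magnitude grows at every iteration, which is the instability described in the text.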


6.3 A heuristic discussion of transient behavior 


The results of the previous subsection require refinement to obtain
a more global description of the behavior of the trajectories of (6).
More importantly, the vector algorithm needs to be analyzed and that
appears to be formidable. These are items for future research. At 
present we are limited to drawing on what we have developed in 
Sections V and 6.2.1 to express our current intuition of the qualitative 
nature of tap evolution: Far from the target region, there are saddle 
points representing loci where more than one target region is equally 
accessible. The evolution is not stymied as the strong random effect 
forces a choice. Still far from the target region, there is a very strong 
trend toward the target. The motion slows as the target is approached 
and becomes that of a particle elastically bound to the elliptical bull’s- 
eye corresponding to the Nyquist responses. Close to the target, the 
radial motion is not essentially different from an Ornstein-Uhlenbeck
evolution, except in one respect. The evolution has a tangential indiffer- 
ence that is of no consequence since phase coherence is left for an 
auxiliary process. The 0 tap setting is an unstable equilibrium that is to 
be avoided (and that is not difficult). To the extent that the vector tap 
does not begin too close to zero, the transit time to target would seem to 
be dominated by the Ornstein-Uhlenbeck-like motion.


VII. REMOVING PHASE AMBIGUITY 


A useful feature of the quartic algorithm that we have been discuss- 
ing is that equalizer convergence does not need carrier recovery. As 
explained in Ref. 2, the tracking of carrier phase can be carried out 
using a decision-directed phase-locked loop that will converge once 
equalization has taken place. For digital radio applications, this feature 
implies a significant immunity of the equalization process to frequency 
offset and phase jitter. 

On the other hand, if a digital radio system were designed with 
highly stable oscillators, could we employ a quartic algorithm providing 
for coherent recovery without a phase-locked loop? After all, standard, 
decision-directed MMSE equalization provides, upon convergence, for 
coherent recovery of the signal constellation. This result holds, in 
principle, under the idealized assumption that the channel is linear 
and time invariant. In practice, depending on the degree of phase jitter 
and frequency offset, a phase-locked loop may or may not be required. 
In this section, we seek to provide an analogous result for nondecision- 
directed equalization. Specifically, we demonstrate the existence of a 
potential function that provides for phase recovery as part of equali- 
zation. Of course, a QAM constellation is left invariant by 90 degree 
rotations, so the more precise meaning of the term “phase recovery” 
is phase recovery modulo 90 degrees. 

For expositional simplicity, we develop the potential function for
the asymptote of a QAM constellation with infinitely many points.
The approach to defining corresponding potentials for finite constellations
is the same as for the asymptotic case. The limiting form, as
$L \to \infty$, of the potential function given in equation (5) is


$$\phi(s) = 2\Big(\sum_i |s_i|^2\Big)^2 - 0.6\sum_i |s_i|^4 - 2.8\sum_i |s_i|^2 + 1.96.$$


We will show how to modify $\phi(s)$, but first, for a backdrop, we discuss
a much simpler situation in which phase recovery is obtained.


7.1 Orienting a rotated but otherwise perfect constellation 


We have seen that the stochastic gradients with respect to $c$ of
$E|z_n|^4$ and $E|z_n|^2$ are readily accessible. From (1) and the ensuing
discussion, we have that for appropriate choice of $A$ and $B$, tap settings
evolving in accordance with the gradient field of $AE|z_n|^4 + BE|z_n|^2$
converge to an optimal equalizer up to a rotation. Obviously, $\mathrm{Re}\,Ez_n^4$
also has a readily accessible stochastic gradient. Moreover, $\mathrm{Re}\,Ez_n^4$
may be useful for phase recovery, as we next indicate.

It is easy to see that $Ea_n^4$ is a negative number. Let $a_n' = e^{j\theta}a_n$, where
$\theta$ is an arbitrary phase displacement that does not depend on $n$. It
follows that


$$\eta(\psi) = E\,\mathrm{Re}(e^{j\psi}a_n')^4$$


is minimized when $\psi = -\theta \pmod{\pi/2}$. These are the only minima,
whereas $\psi + \theta = \pi/4 \pmod{\pi/2}$ are the only maxima. Consequently, a
tap rotating according to the gradient field of $\eta(\psi)$ comes to a stable
state when the constellation is correctly oriented.
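
The orientation criterion can be evaluated on a concrete constellation. The sketch below is a minimal illustration, assuming a 16-point square constellation scaled to unit average power and a hypothetical phase displacement of 0.2 rad; it computes $Ea^4$ and searches a grid over one quadrant for the rotation that minimizes $E\,\mathrm{Re}(e^{j\psi}a')^4$. All function and variable names are hypothetical.

```python
import cmath
import math


def qam_points(L):
    """Square L-by-L QAM points scaled so that E|a|^2 = 1."""
    levels = range(-(L - 1), L, 2)
    pts = [complex(x, y) for x in levels for y in levels]
    scale = math.sqrt(sum(abs(p) ** 2 for p in pts) / len(pts))
    return [p / scale for p in pts]


def eta(psi, theta, pts):
    """E Re (e^{j psi} a')^4 with a' = e^{j theta} a, averaged over the constellation."""
    rot = cmath.exp(1j * (psi + theta))
    return sum(((rot * p) ** 4).real for p in pts) / len(pts)


pts = qam_points(4)                       # 16-point constellation as an example
fourth = sum(p ** 4 for p in pts) / len(pts)
theta = 0.2                               # hypothetical phase displacement
grid = [k * (math.pi / 2) / 500 for k in range(500)]
psi_best = min(grid, key=lambda psi: eta(psi, theta, pts))
offset = (psi_best + theta) % (math.pi / 2)

print(fourth.real, psi_best, min(offset, math.pi / 2 - offset))
```

The minimizing rotation cancels the displacement up to a multiple of 90 degrees, to within the grid resolution, which is the fourfold ambiguity inherent in the constellation.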

Based on what we have discussed thus far, one might suspect the
existence of a potential of the form $E(A'|z_n|^4 + B'|z_n|^2 + C'\,\mathrm{Re}\,z_n^4)$
whose only points of stability are ideal Nyquist responses with
recovered carrier. Such potentials exist, as we now show.


7.2 Equalization with orientation 


In what follows, positive parameters $\nu$ and $\mu$ are introduced in the
potential function, $\phi(s)$, as follows:


$$\phi(s) = 2\Big(\sum_k |s_k|^2\Big)^2 - 2.8\sum_k |s_k|^2 - \mu\sum_k |s_k|^4 - 2\nu\,\mathrm{Re}\sum_k s_k^4.$$

Later we will see that $\nu$ and $\mu$ can be set to get the tap evolution
desired. Namely, we can obtain an evolution whose only points of
stability are of the form $\mathbf{0}\epsilon\mathbf{0}$, ($\epsilon^4 = 1$).

The gradient of $\phi(s)$ with respect to the conjugate coordinates is

$$\nabla\phi_{\bar{s}_i} = 4\sum_k |s_k|^2 s_i - 2.8 s_i - 2\mu|s_i|^2 s_i - 4\nu \bar{s}_i^3. \tag{12}$$


The solutions are the stationary points. The Hessian matrix is denoted
$\mathcal{H} = (H_{ij})$ (where $H_{ij} = (\partial^2\phi)/(\partial\bar{s}_i\,\partial s_j)$). We have

$$H_{ij} = \begin{cases} 4\sum_k |s_k|^2 + 4|s_i|^2 - 2.8 - 4\mu|s_i|^2 & \text{for } i = j \\ 4 s_i \bar{s}_j & \text{for } i \neq j. \end{cases} \tag{13}$$


As in Section 5.2, $\mathbf{0}$ is easily seen to be a local maximum. Expressing
$s_i$ as $|s_i|e^{j\psi_i}$, eq. (12) for the non-null stationary points becomes


$$-4\nu|s_i|^2 e^{-4j\psi_i} - 2\mu|s_i|^2 - 2.8 + 4\sum_k |s_k|^2 = 0$$


or

$$|s_i|^2 = \frac{4\sum_k |s_k|^2 - 2.8}{4\nu e^{-4j\psi_i} + 2\mu}. \tag{14}$$

From the nonnegativity of $|s_i|$, we conclude that

$$e^{-4j\psi_i} = \pm 1.$$

But $e^{-4j\psi_i} = -1$ cannot be associated with a local minimum. Just look
at (14) and notice $e^{-4j\psi_i} = e^{4j\psi_i} = -1$ implies $\mathrm{Re}\,s_i^4 < 0$ so the
mapping $s \to (\ldots, s_{i-1}, s_i e^{j\pi/4}, s_{i+1}, \ldots)$ reduces the value of $\phi$. The local
minima of $\phi$ must have $e^{4j\psi_i} = +1$. These minima satisfy

$$|s_i|^2 = \frac{4\sum_k |s_k|^2 - 2.8}{4\nu + 2\mu}. \tag{14a}$$


Each minimum has all nonzero coordinates of equal modulus, all equal
to the right-hand side of (14). Say there are $M$ nonzero coordinates;
then


$$|s_i|^2 = \frac{1.4}{2M - 2\nu - \mu}.$$

To ascertain whether the stationary points are local minima, or points
of instability (saddle points), we need to determine the nature of the
Hessian at the stationary points. To effectively deal with the Hessian,
it is mathematically convenient to permute coordinates so that the
coordinates 1 through $M$ are the ones for which (14a) holds. Some
notation is also needed. $0_{p,q}$ represents a matrix with $p$ rows and $q$
columns in which each element is zero. $I_p$ denotes a $p \times p$ identity
matrix. The Hessian is


$$\mathcal{H} = \frac{0.7}{2M - 2\nu - \mu}\left\{\begin{pmatrix}(2\nu - \mu)I_M & 0_{M,\infty} \\ 0_{\infty,M} & (2\nu + \mu)I_\infty\end{pmatrix} + 2s\bar{s}'\right\}. \tag{15}$$


Notice $\mathcal{H}$ is expressed as the sum of a diagonal matrix and a dyad.
The nonzero elements of $s$ in eq. (15) are all $\pm 1$ or $\pm j$. The special
structure of $\mathcal{H}$ allows the spectrum of $\mathcal{H}$ to be easily found. The
nonzero eigenvalues are

$\dfrac{0.7(2M + 2\nu - \mu)}{2M - 2\nu - \mu}$ of multiplicity one

$\dfrac{0.7(2\nu - \mu)}{2M - 2\nu - \mu}$ of multiplicity $M - 1$

$\dfrac{0.7(\mu + 2\nu)}{2M - 2\nu - \mu}$ of infinite multiplicity.

It remains now to choose $\mu$ and $\nu$ so that the only stable equilibria are
of the form $\mathbf{0}\epsilon\mathbf{0}$, where $\epsilon^4 = 1$.

For $|s_i| = 1$ when $M = 1$ choose $\mu + 2\nu = 0.6$. From the spectrum
of eigenvalues it follows that if we have


$$2 - \mu + 2\nu > 0, \qquad -\mu + 2\nu < 0,$$


then $M = 1$ gives a positive semidefinite Hessian and $M > 1$ gives an
indefinite Hessian. For example, $\mu = 0.45$ and $\nu = 0.075$ satisfies all
requirements. So the only stable equilibria of $\phi(s)$ with $\mu = 0.45$ and
$\nu = 0.075$ are of the form $\mathbf{0}\epsilon\mathbf{0}$ with $\epsilon^4 = 1$.
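
The example parameter choice can be probed numerically. The sketch below is an illustration under stated assumptions: it takes the modified potential as $2(\sum|s_k|^2)^2 - 2.8\sum|s_k|^2 - \mu\sum|s_k|^4 - 2\nu\,\mathrm{Re}\sum s_k^4$ with $\mu = 0.45$ and $\nu = 0.075$, truncates the response to three coordinates, and probes only along coordinate axes, which is a necessary but not sufficient test for a local minimum. The names `phi`, `passes_axis_probe`, `oriented`, `tilted`, `two_tap`, and `traded` are hypothetical.

```python
import cmath
import math


def phi(s, mu=0.45, nu=0.075):
    """Modified potential with orientation term (additive constant omitted)."""
    p2 = sum(abs(x) ** 2 for x in s)
    p4 = sum(abs(x) ** 4 for x in s)
    re4 = sum((complex(x) ** 4).real for x in s)
    return 2 * p2 ** 2 - 2.8 * p2 - mu * p4 - 2 * nu * re4


def passes_axis_probe(s, eps=1e-3):
    """True if no small real or imaginary step in any single coordinate lowers phi."""
    base = phi(s)
    for k in range(len(s)):
        for delta in (eps, -eps, 1j * eps, -1j * eps):
            probe = list(s)
            probe[k] += delta
            if phi(probe) < base:
                return False
    return True


mu, nu = 0.45, 0.075
oriented = [0.0, 1.0, 0.0]                          # one unit tap, correctly oriented
tilted = [0.0, cmath.exp(1j * math.pi / 4), 0.0]    # same modulus, rotated by 45 degrees
r = 1.4 / (2 * 2 - 2 * nu - mu)                     # squared modulus for M = 2
two_tap = [math.sqrt(r), math.sqrt(r), 0.0]
traded = [math.sqrt(r + 0.01), math.sqrt(r - 0.01), 0.0]

print(passes_axis_probe(oriented), phi(tilted) - phi(oriented), phi(traded) - phi(two_tap))
```

The oriented single-tap response survives the probe, the 45-degree rotation raises the potential, and trading energy between the two taps of the $M = 2$ point lowers it, in line with the typing given in the text.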

Again, as in Section 5.2.1, we have described the stationary behavior 
of the gradient field with respect to s and not with respect to c. The 
argument extends to tap space in the same manner as in Section 5.2.2. 


VIII. ALGORITHM FOR CROSS-POLARIZATION CANCELLATION AS
WELL AS EQUALIZATION

In this section we develop the theory for a cross-polarization can-
cellation algorithm. We will establish that a $2 \times 2$ matrix equalizer
will converge so that both receiver outputs are free of Intersymbol 
Interference (ISI) and cross-polarization interference. There is a pos- 
sibility that, despite the perfect decoupling, one or both polarizations 
may be transposed. The taps evolve in accordance with the gradient 
of a vector potential that will be introduced shortly. Upon convergence, 
phase needs to be recovered by a pair of phase-locked loops. This 
transposition ambiguity is easily resolved in practice. For example, 
the polarizations may be “tagged” by the scrambling process. When 
necessary, the procedure can be reinitialized to attempt to avoid 
locking onto the undesired polarization. 

Some notation needs to be introduced. We need a two-dimensional 
setting to account for horizontally and vertically polarized signals. 
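Here c and h are 2 × 2 matrices. The vectors $(z_H, z_V)$ and $(a, b)$ are
related as follows:

$$\begin{pmatrix} z_H \\ z_V \end{pmatrix} = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} * \begin{pmatrix} h_{11} & h_{12} \\ h_{21} & h_{22} \end{pmatrix} * \begin{pmatrix} a \\ b \end{pmatrix}.$$

The individual elements of $z_H$, $z_V$, a and b are denoted by subscripts.
We use s to denote the matrix

$$c * h = \begin{pmatrix} c_{11} * h_{11} + c_{12} * h_{21} & c_{11} * h_{12} + c_{12} * h_{22} \\ c_{21} * h_{11} + c_{22} * h_{21} & c_{21} * h_{12} + c_{22} * h_{22} \end{pmatrix}.$$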


The matrix h is assumed to be nonsingular so that $h^{-1}$ exists:

$$h^{-1} * h = \begin{pmatrix} \delta & 0 \\ 0 & \delta \end{pmatrix}.$$


The components of the vector (a, b) represent the QAM data sequence
driving the horizontal and vertical polarizations. Of course, the elements
of a and b are all independent.

We employ a vector criterion; specifically we want 


$$\min_{(c_{11},\,c_{12})} E\left(|z_{Hn}|^2 - E|a_n|^4\right)^2$$

$$\min_{(c_{21},\,c_{22})} E\left(|z_{Vn}|^2 - E|b_n|^4\right)^2$$


The advisability of this vector criterion will become apparent. Notice
that optimization of these two components proceeds independently of each
other, in that the first component involves $c_{11}$ and $c_{12}$ while the second
involves $c_{21}$ and $c_{22}$.

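We next show how to express these two expectations in terms of
the components of s (denoted $s_{ij}(k)$, i, j = 1, 2) and the moments of $a_n$
and $b_n$. By symmetry it will be enough to make this demonstration for
the first expectation. Again we normalize $E|a_n|^2 = E|b_n|^2 = 1$. Since

$$z_{Hn} = \sum_k \left(s_{11}(k)a_{n-k} + s_{12}(k)b_{n-k}\right),$$

we obtain

$$E|z_{Hn}|^4 = E \sum_k \sum_l \sum_p \sum_q \left(s_{11}(k)a_{n-k} + s_{12}(k)b_{n-k}\right)\left(\bar{s}_{11}(l)\bar{a}_{n-l} + \bar{s}_{12}(l)\bar{b}_{n-l}\right) \times \left(s_{11}(p)a_{n-p} + s_{12}(p)b_{n-p}\right)\left(\bar{s}_{11}(q)\bar{a}_{n-q} + \bar{s}_{12}(q)\bar{b}_{n-q}\right).$$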


Consider the product inside the quadruple sum. Of the 16 terms
only the six involving $a_{n-k}\bar{a}_{n-l}a_{n-p}\bar{a}_{n-q}$, $a_{n-k}\bar{a}_{n-l}b_{n-p}\bar{b}_{n-q}$,
$a_{n-k}\bar{b}_{n-l}b_{n-p}\bar{a}_{n-q}$, $b_{n-k}\bar{b}_{n-l}b_{n-p}\bar{b}_{n-q}$, $b_{n-k}\bar{b}_{n-l}a_{n-p}\bar{a}_{n-q}$ and
$b_{n-k}\bar{a}_{n-l}a_{n-p}\bar{b}_{n-q}$ give
nonzero expectations. This simplification follows by recalling that a
and b are independent and $Ea^j = Eb^j = 0$ for j = 1, 2, 3. Concerning
the six terms, we note that the last three terms become the same as
the first three if we transpose a and b. Moreover, the second and third
terms are exactly the same since the only apparent differences are in
the labelings of indices that are being summed. So we need to ascertain
only the first two of the six expectations and symmetry will give the
rest. The two sums are already available. Namely,
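$$\sum_k \sum_l \sum_p \sum_q E\{s_{11}(k)a_{n-k}\,\bar{s}_{11}(l)\bar{a}_{n-l}\,s_{11}(p)a_{n-p}\,\bar{s}_{11}(q)\bar{a}_{n-q}\} = (E|a_n|^4 - 2)\sum_k |s_{11}(k)|^4 + 2\Big(\sum_k |s_{11}(k)|^2\Big)^2 \tag{16}$$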


and 


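$$\sum_k \sum_l \sum_p \sum_q E\{s_{11}(k)a_{n-k}\,\bar{s}_{11}(l)\bar{a}_{n-l}\,s_{12}(p)b_{n-p}\,\bar{s}_{12}(q)\bar{b}_{n-q}\} = \Big(\sum_k |s_{11}(k)|^2\Big)\Big(\sum_k |s_{12}(k)|^2\Big). \tag{17}$$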


Using the aforementioned symmetry gives, for the six terms comprising
$E|z_{Hn}|^4$, four copies of (17), and in addition to (16), its counterpart
with $s_{11}$ and $s_{12}$ interchanged. To compute $E(|z_{Hn}|^2 - E|a_n|^4)^2$, we
also need $E|z_{Hn}|^2 = E|a_n|^2(\sum_k |s_{11}(k)|^2 + \sum_k |s_{12}(k)|^2)$. At this point
we can substitute all the terms making up $E(|z_{Hn}|^2 - E|a_n|^4)^2$ and
rearrange to get

$$\begin{aligned}
E\left(|z_{Hn}|^2 - E|a_n|^4\right)^2 ={}& 2\Big(\sum_k |s_{11}(k)|^2 + \sum_k |s_{12}(k)|^2\Big)^2 \\
&+ (E|a_n|^4 - 2)\Big(\sum_k |s_{11}(k)|^4 + \sum_k |s_{12}(k)|^4\Big) \\
&- 2E|a_n|^4\Big(\sum_k |s_{11}(k)|^2 + \sum_k |s_{12}(k)|^2\Big) + (E|a_n|^4)^2.
\end{aligned} \tag{18}$$
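
The closed form can be checked numerically. The following is a minimal sketch, not the author's program. It assumes a square QAM alphabet with equiprobable, independent symbols, normalized to unit power, with the same fourth moment on both polarizations; all function names are hypothetical. It averages $(|z_{Hn}|^2 - E|a_n|^4)^2$ over every symbol combination for short $s_{11}$, $s_{12}$ and compares the result with the four-term expression on the right of (18).

```python
import itertools


def qam_points(L):
    """Square QAM alphabet with L*L points, scaled so that E|a|^2 = 1."""
    levels = [2 * i - (L - 1) for i in range(L)]
    pts = [complex(x, y) for x in levels for y in levels]
    power = sum(abs(p) ** 2 for p in pts) / len(pts)
    return [p / power ** 0.5 for p in pts]


def fourth_moment(pts):
    """E|a|^4 for equiprobable points."""
    return sum(abs(p) ** 4 for p in pts) / len(pts)


def cost_by_enumeration(s11, s12, pts):
    """Average of (|z|^2 - E|a|^4)^2 over all symbol combinations."""
    m4 = fourth_moment(pts)
    taps = list(s11) + list(s12)  # one independent symbol per tap
    total, count = 0.0, 0
    for symbols in itertools.product(pts, repeat=len(taps)):
        z = sum(t * a for t, a in zip(taps, symbols))
        total += (abs(z) ** 2 - m4) ** 2
        count += 1
    return total / count


def cost_by_formula(s11, s12, m4):
    """Right-hand side of (18) with E|a|^2 = E|b|^2 = 1."""
    p2 = sum(abs(x) ** 2 for x in s11) + sum(abs(x) ** 2 for x in s12)
    p4 = sum(abs(x) ** 4 for x in s11) + sum(abs(x) ** 4 for x in s12)
    return 2 * p2 ** 2 + (m4 - 2) * p4 - 2 * m4 * p2 + m4 ** 2
```

For example, calling both functions with `qam_points(4)`, `s11 = [0.8, 0.1 + 0.2j]` and `s12 = [0.05j, -0.1]` enumerates 16^4 symbol combinations; the two values should then agree to rounding error.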


We introduce a new sequence S(k) obtained by alternating $s_{11}(k)$
and $s_{12}(k)$ elements. Thus S(0) = $s_{11}(0)$, S(1) = $s_{12}(0)$, S(2) = $s_{11}(1)$,
S(3) = $s_{12}(1)$, ··· and S(−1) = $s_{12}(-1)$, S(−2) = $s_{11}(-1)$, S(−3) =
$s_{12}(-2)$, S(−4) = $s_{11}(-2)$, ···. Then (18) becomes the same as (5) with
S replacing s. The invertibility of h enables us to achieve the optimum
value.
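
The interleaving can be sketched as follows. This is an illustration only: the function name and the dictionary representation of doubly infinite sequences (index mapped to value, missing entries taken as zero) are assumptions made for the example.

```python
def interleave(s11, s12):
    """Build S with S(2k) = s11(k) and S(2k + 1) = s12(k).

    s11 and s12 are dicts mapping an integer index k to a tap value.
    """
    S = {}
    for k in set(s11) | set(s12):
        S[2 * k] = s11.get(k, 0)
        S[2 * k + 1] = s12.get(k, 0)
    return S


s11 = {-2: 0.1, -1: 0.2, 0: 0.9, 1: 0.3}
s12 = {-2: 0.01, -1: 0.02, 0: 0.05, 1: 0.03}
S = interleave(s11, s12)
```

Because the map is one-to-one, the sums of squared and fourth-power magnitudes of S equal the combined sums over $s_{11}$ and $s_{12}$, which is why the two-sequence expression reduces to the single-sequence one.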

The stationary behavior of (18) now follows immediately from the 
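results in Section 5.1. The minimizing $c_{11}$ and $c_{12}$ satisfy

$$\begin{pmatrix} c_{11} * h_{11} + c_{12} * h_{21} \\ c_{11} * h_{12} + c_{12} * h_{22} \end{pmatrix} = \begin{pmatrix} 0 \\ 0\,\epsilon\,0 \end{pmatrix} \text{ or } \begin{pmatrix} 0\,\epsilon\,0 \\ 0 \end{pmatrix}.$$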

We are now in a predicament analogous to that at the end of Section 
5.1. As the results stand, they apply to s, not to c. A very straightfor- 
ward vector matrix extension of the argument in Section 5.1 gives the 
results desired for c. 


IX. FINITE NUMBER OF TAPS 


Thus far, the theory has been idealized in the sense that infinitely 
many taps were assumed. Can we be sure that, by using a large number 
of taps, an equalizer will behave in essentially the same way as an
infinite tap equalizer? Unlike MMSE theory, the theory of the quartic
algorithm with finitely many taps is awkward to treat analytically. 
Partial, positive results are available and are conveyed here. 

Section 9.1 enhances the quartic algorithm to alleviate one
of the difficulties arising with a finite equalizer. Section 9.2
mentions very special related examples, where, with a finite
number of taps, the convergence theory is complete and satisfactory.
Section 9.3 reviews the status of the finite tap issue.

The related topic of quantifying the number of taps needed in digital 
radio applications seems best addressed by computer-aided analysis 
(as with MMSE equalization’). We emphasize that, in the digital 
radio application, we are not aiming to equalize pathological H(w) with 
severe in-band nulls (and consequently an unreasonable number of 
taps to approximate $h^{-1}$). Rather, as indicated in Section VI, the
interest is in equalizing when H(w) can support a P{ of the order of 
10~*. For implementations, one would expect to use fractionally spaced 
rather than synchronously spaced taps. A numerical example is re- 
ported in Section 9.4. 


9.1 Centering the tap weight distribution 


The feature that the algorithm has no preference as to which tap 
should be the reference tap implies that, with finitely many taps, the 
tap weight distribution could crowd to one end of the equalizer. To 
avoid a lopsided tap-weight distribution one could periodically (say,
every few hundred symbols) compute the center of gravity of
the tap weights and then shift the weights to situate the balance point
as close as possible to the center tap. When the quartic algorithm is
in use, outage time is already registering, so no additional outage is caused by
shifting. The centering helps approximate the effect of infinitely many
taps that the theory has required. A computationally simpler alternative
to the center of gravity method is to periodically compare the
weights of the first and last tap, and then shift tap weights by one in
the direction of the lesser weight.
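
Both centering rules can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any implemented equalizer: the "weight" of a tap is taken to be its squared magnitude for the center of gravity and its magnitude for the end-tap comparison, vacated positions are zero-filled, and all function names are hypothetical.

```python
def shift_taps(taps, shift):
    """Move every tap by `shift` positions (positive = toward higher index).

    Taps pushed past either end are dropped; vacated positions become zero.
    """
    n = len(taps)
    out = [0j] * n
    for i, c in enumerate(taps):
        j = i + shift
        if 0 <= j < n:
            out[j] = c
    return out


def center_of_gravity(taps):
    """Index-weighted mean, using |c|^2 as the weight of each tap."""
    weights = [abs(c) ** 2 for c in taps]
    return sum(i * w for i, w in enumerate(weights)) / sum(weights)


def recenter_by_gravity(taps):
    """Shift so the balance point lands as near the center tap as possible."""
    center = (len(taps) - 1) / 2
    return shift_taps(taps, round(center - center_of_gravity(taps)))


def recenter_by_end_taps(taps):
    """Simpler rule: shift by one toward whichever end tap is smaller."""
    first, last = abs(taps[0]), abs(taps[-1])
    if first < last:
        return shift_taps(taps, -1)
    if last < first:
        return shift_taps(taps, 1)
    return list(taps)
```

In use, either function would be called every few hundred symbols on the current tap vector, with the adaptation then continuing from the shifted taps.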


9.2 Special-case convergence 


Equalization when accurate estimates of the data are not available 
has been discussed for a very different algorithm in Ref. 5. That paper 
analyzes the case where the channel is perfect but the equalizer is 
misaligned. It is shown that, for certain tap initializations, convergence 
to an undesirable setting is possible. For the Godard algorithm, there 
is no problem when the channel is perfect and the equalizer is mis- 
aligned (arbitrarily). To show that there is no difficulty, we can make 
use of the analysis in Section V. In this case, $s_i = c_i$. The finiteness of
the number of taps does not alter the argument. Convergence to the
optimum tap setting is guaranteed. 

For the second example, in the potential function of (4), interpret s 
(and h and c) as a discrete Fourier transform. Then it is not difficult 
to show that, within this modular context, convergence of c occurs. 
This is an artificial construct. However, a variation of this example of 
a finite tap equalizer may have some utility. Prospective application 
is not within the scope of equalization procedures of the kind that 
leave the transmitted signal inviolate. Rather, the application is as- 
sociated with the equalizer booting methods of the kind that use 
intervals of periodic training sequences. (The DFT is the key to 
modeling such equalizers.’®) The motivation for using quartic criteria 
such as in eq. (3) is that an immunity to frequency offset is anticipated. 
This type of equalization, which uses both a quartic potential and 
training sequence, has arisen in concurrent research by A. A. M. Saleh 
and the author. 


9.3 Status 


The issue of the effect of limiting the number of taps is, at bottom, 
the question of whether with sufficiently many taps and with centering 
as in Section 9.1 the evolution of the quartic error departs negligibly 
from the infinite tap case. Godard’ does not discuss the issue of limiting 
the number of taps. 

For a given application, one could circumscribe a universal ensemble 
of desired $h^{-1}$ and then choose the number of taps large enough so the
approximation error is uniformly negligible, i.e., so that the omitted 
taps are essentially zero anyway. At each point in time, the taps that 
can evolve do so in such a way that the infinitesimal tap evolution is 
the same as the unlimited tap case. The taps that are omitted are 
essentially preset at their optimum values (~zero). The random component
of the evolution serves to mask the effect of any perturbations
of the infinite tap vector field that are caused by limiting the number
of taps.

The material presented so far supports the view that, with a sufficiently large
number of taps (and employing a weight centering procedure), the 
convergence properties of the Godard algorithm differ negligibly from 
what is predicted for the infinite tap equalizer. However cogent the 
supporting evidence, a mathematically rigorous argument has not been 
given. The understanding of the finite tap issue is refined by a 
numerical example that concludes this section. 


9.4 ISI versus number of taps


With the aid of the computer one can take a deeper look at finite 
tap behavior. A computer program has been written to solve the 
deterministic equation of tap evolution

$$c_{(n+1)} = c_{(n)} - \Delta \nabla \mathcal{G}(c_{(n)}).$$

One can track the MSE from start, $\mathrm{MSE}_{(0)}$, to equilibrium, $\mathrm{MSE}_{(\infty)}$.
The potential function, $\mathcal{G}$, could be MMSE (with known data) or the
quartic (with unknown data). The number of levels and the timing 
epoch are potential function parameters. For numerical work, some of 
the idealizing assumptions were dropped and we generalized to incor- 
porate important practical effects. Namely, the equalizer can have 
fractionally spaced taps and the transmitted pulse can be raised cosine 
with arbitrary roll-off factor. 
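
A toy version of such a deterministic tap evolution is sketched below. It is not the program described in the text. It assumes a perfect channel, so that the taps coincide with the overall response s, and it uses the quartic potential $2(\sum|s|^2)^2 + (m_4 - 2)\sum|s|^4 - 2m_4\sum|s|^2 + m_4^2$, the single-polarization form of (18) with $m_4 = E|a_n|^4$ and unit symbol power. The function names, step size, and starting taps are hypothetical.

```python
def quartic_potential(s, m4):
    """Quartic potential for unit-power symbols with fourth moment m4."""
    p2 = sum(abs(x) ** 2 for x in s)
    p4 = sum(abs(x) ** 4 for x in s)
    return 2 * p2 ** 2 + (m4 - 2) * p4 - 2 * m4 * p2 + m4 ** 2


def quartic_gradient(s, m4):
    """Derivative of the potential with respect to each conjugated tap."""
    p2 = sum(abs(x) ** 2 for x in s)
    return [(4 * p2 + 2 * (m4 - 2) * abs(x) ** 2 - 2 * m4) * x for x in s]


def evolve(c0, m4, step=0.05, iterations=2000):
    """Iterate c <- c - step * gradient, starting from c0."""
    c = list(c0)
    for _ in range(iterations):
        g = quartic_gradient(c, m4)
        c = [x - step * gx for x, gx in zip(c, g)]
    return c


start = [0.2 + 0j, 0.6 + 0.1j, 0.3 + 0j]
final = evolve(start, m4=1.32)  # 1.32 is the 16-point value of E|a|^4
```

In this toy case the largest initial tap grows to unit magnitude with its phase unchanged while the other taps decay, which is the single-spike equilibrium the theory predicts for $m_4 < 2$.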

It is beyond the scope of this paper to include a comprehensive 
numerical study. We do, however, report the result of one interesting 
computer experiment. Figure 3(a) shows $\mathrm{MSE}_{(\infty)}$ versus the number
of taps for some anecdotal cases of multipath fade. The fade charac- 
teristics are in accordance with the Rummler model” for multipath 
with midband notches in the range 16 to 22 dB. The Rummler phase 
parameter is zero and the scale parameter is inconsequential, as the 
effect of additive noise is neglected. See Fig. 3(b). The roll-off is 25
percent, and the fractional spacing is half that of synchronous equal- 
ization. The timing epoch is optimized. The curves were generated for 
$L = \infty$ (as we are primarily interested in large constellations and the
characteristics are insensitive to changes in L for L large). Computa- 
tions were made for 3, 5, 7, and 9 taps. The curves shown are solid 
merely for interpolating to the even ordinates; of course there is no 
meaning to a noninteger number of taps. 

The original intent in generating the characteristics in Fig. 3 was to 
compare $\mathrm{MSE}_{(\infty)}$ for the quartic and quadratic potentials. However,
once the points were determined it was discovered that there was no 
difference observable to the eye! In this computer experiment the 
quartic equilibrium is essentially as good as the quadratic one in 
minimizing MSE. This virtual equality is stronger than what is re- 
quired of the quartic algorithm. We only desire that the quartic 
equilibrium be close enough to the quadratic equilibrium so as to create 
the option of switching to MMSE evolution once good decisions are 
available. 


X. DISCUSSION 


We have delved into the theoretical aspects of the quartic algorithm.
The results obtained included stability with respect to tap initialization
(Section V), equalization including acquisition of phase in systems
employing highly stable oscillators (Section VII), and cancellation of
cross-polarization interference (Section VIII). Section VI discussed
convergence times, which are of interest in bootstrapping digital radio
systems. Section IX suggests that with centering, and with enough
taps, the finite tap equalizer may perform essentially as well as the
infinite tap equalizer.

Fig. 3—(a) Steady-state mean-square error versus number of taps for quartic criterion.
(b) Power transfer characteristics of channels used in computation of mean-square
error for quartic algorithm.

It is premature to answer the question of whether digital radio 
systems should be designed to provide for self-alignment without 
inserting media probing signals in the transmitted signal. However, 
the results that have been elucidated thus far are promising enough to 
warrant further study. 

The electronics needed for implementation needs to be assessed.
Ideally, one would like to update taps every symbol interval. Then
convergence speed is high and the operation of the algorithm is less 
vulnerable to the assumption that the channel is unchanged during 
booting. It is conceivable to implement such an algorithm with special- 
purpose hardware. However, this now may prove unrealistic from an 
economic standpoint. (In time the economic issue will disappear.) One 
could slow the algorithm, updating every 10 or 100 symbols, to obtain 
a more realistic implementation. By slowing the algorithm, conver- 
gence speed, rather than economics, becomes the issue. The question 
of how fast an algorithm is needed is particularly difficult to address, 
especially for cross-polarization cancellation, because of the lack of 
data on coupling dynamics. One could strive to compensate for slow- 
ness by developing good, adaptive, step-size algorithms. These are all 
items for future study. 

The best approach for future investigations of the usefulness of 
quartic algorithms in digital radio would seem to require inclusion of 
simulation and/or experimentation aimed at specific applications. 
Ultimately, it is necessary to include additive noise in such evaluations.


APPENDIX
Even-Order Moments of QAM Constellations 


The moments $E|a_i|^N$ for constellations with $L^2$ points are required
in the paper. A general procedure for calculating $E|a_i|^N$ (N even) is
as follows: Write a recursion in L for $m_N(L) \triangleq E(\mathrm{Re}\, a_i)^N$. The
recursion can be solved using transforms. Straightforward algebra can
be used to obtain $E|a_i|^N$ from $m_N(L)$, e.g.,

$$E|a_i|^2 = 2m_2$$

$$E|a_i|^4 = 2(m_4 + m_2^2)$$

$$E|a_i|^6 = 2(m_6 + 3m_4 m_2)$$

$$E|a_i|^8 = 2(m_8 + 4m_6 m_2 + 3m_4^2).$$

Normalizing $E|a_i|^2 = 1$ and following the above suggestion for N = 4
gives

$$E|a_i|^4 = \frac{7L^2 - 13}{5(L^2 - 1)}.$$
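
The closed form can be confirmed by direct computation. The sketch below assumes the usual square constellation, with L equiprobable, equally spaced levels ±1, ±3, ··· on each rail and independent rails; the function names are hypothetical. Exact rational arithmetic is used so that the comparison involves no rounding.

```python
from fractions import Fraction


def rail_moment(L, N):
    """m_N(L): N-th moment of one rail with levels -(L-1), ..., -1, 1, ..., L-1."""
    levels = [2 * i - (L - 1) for i in range(L)]
    return Fraction(sum(x ** N for x in levels), L)


def normalized_fourth_moment(L):
    """E|a|^4 after scaling the constellation so that E|a|^2 = 1."""
    m2 = rail_moment(L, 2)
    m4 = rail_moment(L, 4)
    return 2 * (m4 + m2 ** 2) / (2 * m2) ** 2


def closed_form(L):
    """(7 L^2 - 13) / (5 (L^2 - 1))."""
    return Fraction(7 * L * L - 13, 5 * (L * L - 1))
```

For L = 4 (16 points) both functions give 33/25, that is, 1.32, and for L = 2 both give 1, as expected for a constant-modulus alphabet.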


For a constellation with a large number of points it is useful to have
the asymptotic ($L \to \infty$) form for the moments. The result is

$$\lim_{L \to \infty} m_N(L) = \frac{1}{2A}\int_{-A}^{A} x^N\,dx = \frac{A^N}{N+1}.$$

Since $E|a_i|^2 = 1$ we have $A = \sqrt{3/2}$. Therefore

$$E|a_i|^4 \to 7/5$$

$$E|a_i|^6 \to 81/35$$

and

$$E|a_i|^8 \to 747/175.$$

These asymptotes were used in composing (7b). As a check compare
$E|a_i|^4$ in the exact and asymptotic form. Note $E|a_i|^4$ asymptotes
quickly. For $L^2 = 16$ points $E|a_i|^4 = 1.32$ as compared to 1.4 for an
infinite point constellation.


Baseband Cross-Polarization Interference
Cancellation for M-Quadrature Amplitude- 
Modulated Signals Over Multipath Fading Radio 
Channels 


By M. KAVEHRAD* 
(Manuscript received March 21, 1985) 


In this paper we propose a novel baseband structure capable of adaptively 
mitigating cross-polarization interference in a dual-polarized, M-state quad- 
rature amplitude-modulated received signal. We show that by using this 
canceler, performance signatures very close to single-polarized system signa- 
tures can be achieved for dually polarized digital radio systems. 


I. INTRODUCTION 


Because of frequency reuse via orthogonally polarized channels, 
dual-polarized transmission of M-state Quadrature Amplitude-Modu- 
lated (M-QAM) signals can double the bandwidth efficiency of terres- 
trial radio routes. Such systems transmit two different information 
signals of the same bandwidth and the same carrier frequency by using 
orthogonal field polarization for the transmission of each signal. 
Nonideal antennas and transmission media cause cross-coupling of 
the two signals and cross-polarization interference. Cross-polarization 
interference cancellation using adaptive transversal filters over linear 
dispersive multipath channels has been the subject of considerable 
prior investigation.'* 

In this study we deal with cross-polarization interference cancellation
and intersymbol interference (ISI) equalization separately, and
propose a novel method of cross-polarization cancellation for dual- 
polarized operation of M-QAM signals over dispersive fading channels, 
similar to those experienced in line-of-sight terrestrial radio applica- 
tions. The canceler operates at baseband and improves the dual- 
polarized system performance to very nearly the performance of a 
single-polarized system. 

The canceler design is based on a previous observation that the 
power loss associated with a cross-coupled signal subject to flat or 
mildly dispersive fading brings about an actual reduction in system 
outage time.° In this paper we use the model and results of Ref. 5 to 
introduce the canceler structure and evaluate its performance. To 
enable comparison in the absence of cancellation, we use a dual- 
polarized system performance signature (M-curve) as a measure in 
our evaluation.° 

In the following section we review the results of Ref. 5 briefly, and 
then introduce discussions leading to the realization of the canceler. 
In Section III the canceler performance for both dual-polarized 16- 
and 64-QAM radio systems is presented and the results are discussed 
in detail. 


II. ANALYTICAL MODEL


In this section we describe briefly the channel model and underlying 
assumptions germane to baseband cancellation, and then introduce 
the canceler model. 


2.1 Channel model 
Fig. 1—Channel model for a dual-polarized system.

The dual-polarized channel model is shown in Fig. 1. Two independent
data streams are separately modulated and transmitted cochannel,
using orthogonal polarizations. Rummler’s multipath fading
model, which assumes the presence of a single inband notch, is applied 
to both the copolarized and the cross-coupled interfering channel 
transfer functions.® In other words, fadings of both the desired copo- 
larized and cross-polarized interference channels are assumed to be 
the single notch type and independent. As depicted in Fig. 1 and 
derived in detail in Ref. 5, the received in-phase signal on the reference 
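copolarized path is denoted by $r_I(t)$:

$$\begin{aligned}
r_I(t) ={}& a_I b_0^{I}\{\cos(\phi_I)p(t) + \rho_I\cos[(\omega_c - \omega_{0I})\tau_I + \pi + \phi_I]\,p(t - \tau_I)\} \\
&+ a_I \sum_{k \neq 0} b_k^{I}\{\cos(\phi_I)p(t - kT_s) + \rho_I\cos[(\omega_c - \omega_{0I})\tau_I + \pi + \phi_I]\,p(t - kT_s - \tau_I)\} \\
&+ a_I \sum_{k} \beta_k^{I}\{\sin(\phi_I)p(t - kT_s) + \rho_I\sin[(\omega_c - \omega_{0I})\tau_I + \pi + \phi_I]\,p(t - kT_s - \tau_I)\} \\
&+ a_{II} \sum_{k} b_k^{II}\{\cos(\phi_I + \theta_m)p(t - kT_s - \tau_m) \\
&\qquad + \rho_{II}\cos[(\omega_c - \omega_{0II})\tau_{II} + \pi + \phi_I + \theta_m]\,p(t - kT_s - \tau_{II} - \tau_m)\} \\
&+ a_{II} \sum_{k} \beta_k^{II}\{\sin(\phi_I + \theta_m)p(t - kT_s - \tau_m) \\
&\qquad + \rho_{II}\sin[(\omega_c - \omega_{0II})\tau_{II} + \pi + \phi_I + \theta_m]\,p(t - kT_s - \tau_{II} - \tau_m)\} \\
&+ \mathrm{Re}\{n_I(t)\},
\end{aligned} \tag{1}$$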


where Re{·} stands for real part. In this equation, $(b_k^i, \beta_k^i)$, i = I, II,
represent the real and imaginary parts of the complex-valued transmitted
symbols on the two polarizations, I and II, at consecutive
instants $kT_s$, k = 0, 1, 2, ···, where $T_s$ is a baud period. The Nyquist-shaping
filter impulse response is denoted by p(t), and $\omega_c$ is the
nominal carrier frequency. The parameters $a_i$, $\rho_i$, $\omega_{0i}$, $\tau_i$; i = I, II
represent the flat fade level, fade notch depth, fade notch position,
and relative delay between the two rays in each of the Rummler type
multipath fading models (the reference copolarized and the corresponding
cross-coupled interfering channels). Also, in eq. (1), $\tau_m$ and
$\theta_m$ account for any symbol timing or carrier phase asynchronism that
may exist between the two polarized signals at the transmitter location.

It might be worthwhile, at this point, to discuss the impact of
the status of the transmitter local oscillators on the theoretical modeling of the
channel. Illustrated in Fig. 2 is a typical dual-polarized system transmitter
configuration. As seen, there are three major sets of local
oscillators in the transmitter system that can play an important role
in modeling a dual-polarized system: local oscillators used to
provide clock timing to baseband sequences, IF local oscillators providing
carrier signals to modulators, and microwave upconverter oscillators.
Because we intend to introduce baseband cancellation in the
following sections, receiver implementation is simpler if we synchronize
all the transmitter local oscillators. In other words, we assume
$\tau_m = 0$ and $\theta_m = 0$ and investigate the overall system performance.
This assumption also results in improved performance of the cross- 
polarization interference canceler as was demonstrated in Ref. 5 for 
the general dual-polarized system performance signatures in the ab- 
sence of a canceler. 

The optimum phase between the modulator and the demodulator of
the reference copolarized signal (i = I) for optimum timing is denoted
by $\phi_I$. Note that for a strong main polarization signal and
because of the independence assumption on the cross-coupled signal,
$\phi_I$ is imposed on the latter by the copolarized signal demodulator.°

As noted in eq. (1), the dispersive nature of the multipath channel 
is completely described by the superposition of four impulse responses, 
each weighted by an appropriate independent transmitted symbol 
state. These impulse responses for the kth transmitted symbol interval
are

$$u_{iI} = a_I\{p(t - kT_s)\cos(\phi_I) + \rho_I\,p(t - kT_s - \tau_I)\cos[(\omega_c - \omega_{0I})\tau_I + \pi + \phi_I]\}, \tag{2a}$$

$$u_{qI} = a_I\{p(t - kT_s)\sin(\phi_I) + \rho_I\,p(t - kT_s - \tau_I)\sin[(\omega_c - \omega_{0I})\tau_I + \pi + \phi_I]\}, \tag{2b}$$

$$u_{iII} = a_{II}\{p(t - kT_s - \tau_m)\cos(\phi_I + \theta_m) + \rho_{II}\,p(t - kT_s - \tau_{II} - \tau_m)\cos[(\omega_c - \omega_{0II})\tau_{II} + \pi + \phi_I + \theta_m]\}, \tag{2c}$$

and

$$u_{qII} = a_{II}\{p(t - kT_s - \tau_m)\sin(\phi_I + \theta_m) + \rho_{II}\,p(t - kT_s - \tau_{II} - \tau_m)\sin[(\omega_c - \omega_{0II})\tau_{II} + \pi + \phi_I + \theta_m]\}, \tag{2d}$$


where the variables have been previously defined. For the received in- 
phase part of the main polarization signal, eqs. (2a) and (2b) describe 
the distorted in-phase and quadrature-coupled signals of the reference 
copolarized transmitter, respectively, and equations (2c) and (2d) 
describe the corresponding signals from the cross-polarized interferer. 

To introduce the parameters that define the fading character of the
interfering cross-coupled signal path, we associate with each interferer
fading event a triplet representing its dispersive fading status.
This triplet is

$$\left(20 \log \frac{a_{II}}{a_I}\ \text{(dB)},\ \ -20 \log|1 - \rho_{II}|\ \text{(dB)},\ \ \Delta f_{0II}\ \text{(MHz)}\right), \tag{3}$$


where $a_{II}$ and $a_I$ represent the flat fade levels for cross-coupled and
copolarized signals, respectively. In the triplet, the second term is the
dispersive fade notch depth, and $\Delta f_{0II}$ denotes fade notch position
relative to the carrier frequency of the cross-coupled channel. (Notice
that other definitions of notch depth can be found in the literature.*)
For illustrative purposes, we demonstrate eqs. (2a) through (2d) in
Figs. 3 and 4 for an interferer of (−20, 0, 0) fade and two different fade
conditions of the reference main polarization path (copolarized channel).
In Fig. 3 we illustrate the aforementioned impulse responses
when a notch-centered fade of 10 dB is applied to the main polarization
signal. Observe that since the fade on the latter is notch centered and
$\theta_m = 0$, $u_{qI}$ and $u_{qII}$ are both zero. In Fig. 4 an 11-MHz offset fade of
7.5 dB is applied to the main polarization path, and even though the
interferer has a flat fade, because of the phase $\phi_I$ imposed on it, $u_{iII}$
and $u_{qII}$ are nonzero Nyquist-shaped pulses with their relative positions
also determined by the phase and timing imposed on them by
the dominant polarization signal.

Now we define a decision variable which is a function of the desired
symbol to be detected, intersymbol interference, cross-polarized interference,
and Gaussian noise. To evaluate the average error probability,
first we derive the conditional error probability conditioned on the
composite interference. Then by applying moment-generating functions
and the Gauss quadrature method, we determine the average
error probability. The details of this procedure are explained in Ref.
5.

Fig. 3—16-QAM signal, time-domain impulse responses for a notch-centered fade.

Fig. 4—16-QAM signal, time-domain impulse responses for an offset fade.

Using eq. (1), we then computed the performance signatures
(M-curves) of the main (reference) polarization signal (i = I) that
provide a locus of the fade notch depth (in dB) versus the relative fade
notch position (in MHz) for a $10^{-3}$ average probability of error. In Fig.
5 we illustrate the performance signatures of the main polarization
signal, that is, $-20 \log|1 - \rho_I|$ versus $\Delta f_{0I}$, where $\rho_I$ is the dispersive
fade notch depth of the main polarization path, and $\Delta f_{0I}$ denotes its
fade notch position relative to the carrier frequency. Along the curves
we have specified average signal-to-interference ratio at a selected
number of points. As a reference we illustrate the signature of a single-polarized
16-QAM system, that is, $a_{II} = 0$, and label it “1.” A comparison
of curves labeled “2” through “4” for different fadings of the interferer
in Fig. 5 reveals the aforementioned fact that the system outage time
(area under the M-curve) is related to the net interfering power for a
mild dispersive fading of the interfering signal on the cross-coupled
path. For example, a comparison of curves 4 and 2 with the same 20-dB
flat power levels and 0-MHz notch offsets reveals that curve 2, with
a 5-dB inband notch fade, results in less outage time than the fade of
curve 4 with no inband notch. Hence, the greater power loss associated
with curve 2 leads to reduced outage, even though the intersymbol
interference at the reference receiver in the case of curve 2 exceeds
that of curve 4.

Fig. 5—Performance signature curves for dual-polarized 16-QAM radio.
Now consider curves 2 and 3. The data correspond to identical flat
power levels and fade notch depths, with the notch position moving
from 0 MHz (notch centered) to 11 MHz (near the band edge). The
notch-centered fade causes less outage than the notch offset fade
because the unfaded signal spectral energy at 0 MHz is much greater
than that near the band edge; hence, the relationship of curves 3 and
2 is again that of diminished net signal power in the interferer resulting
in a reduced outage. All these curves were computed for a 60-dB signal-to-noise
ratio (s/n), 22.5-Mbaud symbol rate, $\Gamma = 0.45$ Nyquist filter
roll-off, and a 16-QAM radio system.


2.2 Canceler model 


It is well known that interference power is directly related to the 
area of the cross-coupled signal power spectral density. Thus, in dual- 
polarized operation, where the dual-polarized signals are transmitted 
cochannel, any reduction of the interfering signal power spectral 
density area leads to a decrease in the overlap area between the main 
and the cross-coupled signal spectral densities, and, hence, a reduction 
in the interfering power. Therefore, a cross-polarized interference 
canceler able to perform such a task should bring about an improve- 
ment in the performance of the dual-polarized system. It is also well 
known that the main lobe sample of a Nyquist-type pulse is propor- 
tional to the area of its frequency spectrum. Owing to this fact, we 
hypothesized improvements in the dual-polarized system-performance 
signatures, given that the main lobe of the cross-coupled interferer is 
canceled in the time domain. This hypothesis proved to be correct and is
discussed further in Section III.

A block diagram of the cross-polarized interference canceler and 
system equalizers is shown in Fig. 6. Decision feedback complex taps 
cancel the main lobe of the cross-coupled interfering signal adaptively, 
using preliminary estimates of the main-lobe. Least-Mean-Square 
(LMS) adaptation is recommended because it was shown’ that s/n 
degradation by some cross-polarization cancellation methods can be 
large and that the adaptive algorithm should take into account noise 
power minimization.’ This is known to be one of the salient features 
of the LMS algorithms.’ Because the proposed cancellation is per- 
formed at baseband, the difference between input and output of the 
detector slicer circuit (error signal) can be employed as the perfor- 
mance measure and it can be utilized by the LMS controller to derive 
the canceler coefficients. 
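
The adaptation described above can be sketched as a single-tap decision-feedback canceler driven by the slicer error. This is a toy illustration and not the structure of Fig. 6. It assumes a flat, fixed cross-coupling gain, no noise, no ISI, correct preliminary decisions on the interfering symbols, and a 4-point alphabet chosen only to keep the slicer short; all names and the step size are hypothetical.

```python
import random

# Unit-power 4-point alphabet (used here only for brevity).
ALPHABET = [complex(x, y) / 2 ** 0.5 for x in (-1, 1) for y in (-1, 1)]


def slicer(z):
    """Nearest alphabet point: a sign decision on each rail."""
    re = 1.0 if z.real >= 0 else -1.0
    im = 1.0 if z.imag >= 0 else -1.0
    return complex(re, im) / 2 ** 0.5


def run_canceler(coupling, step=0.05, n_symbols=500, seed=1):
    """Adapt one complex tap w so that w * b_hat cancels coupling * b."""
    rng = random.Random(seed)
    w = 0j
    for _ in range(n_symbols):
        a = rng.choice(ALPHABET)   # desired copolarized symbol
        b = rng.choice(ALPHABET)   # symbol on the other polarization
        received = a + coupling * b
        b_hat = b                  # preliminary decision, assumed correct
        z = received - w * b_hat   # canceler output fed to the slicer
        error = z - slicer(z)      # slicer input minus slicer output
        w += step * error * b_hat.conjugate()  # LMS update
    return w


tap = run_canceler(coupling=0.1 - 0.05j)
```

With these assumptions the tap converges geometrically to the coupling gain, so the residual cross-polarized term at the slicer input vanishes.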

Note that in Fig. 6 the baseband canceler precedes the system 
equalizers. This is to prevent the equalizers from causing excessive 
dispersion in the interfering signal when attempting to equalize deep 
fade notches of the copolarized signal as is the case in combined cross- 
polarized cancellation and ISI equalization. Note that the system
equalizers are used to mitigate intersymbol interference and cross-rail
interference; therefore, they are not part of the cross-polarization
cancellation operation.

Fig. 6—Cross-polarized canceler/equalizer structure.

To perform main-lobe cancellation in practice, preliminary esti- 
mates of the signal main lobe on one polarization must be subtracted 
from the opposite polarization received signal that needs to be delayed 
by the amount of time that it takes to estimate the lobe. 

In this theoretical study we assume the preliminary estimates of the 
signal main lobe are correct so that we can cancel the cross-polarized 
signal. In practice this assumption is valid in the steady-state mode of 
operation if some kind of bootstrapped algorithm is adopted. This, of 
course, adds to the circuit complexity. An obvious advantage of this 
method is that the preliminary decisions used to cancel the cross- 
polarized signal provide noise-free estimates; hence, less s/n degradation
is caused by the coupling mechanism than with feedforward methods.

In the next section we elaborate on the system performance signatures
after cross-polarization interference cancellation and compare them
with the signatures of the same system without cross-polarization
cancellation, which we use as baseline measures.


III. CANCELER PERFORMANCE


In this section we present the computed performance signature 
curves for dual-polarized M-QAM signals using the canceler described 
in the previous section. 

Results in the form of performance signatures are illustrated in 
Figs. 7 through 10. As we can observe, use of a single complex decision 
feedback tap to cancel the real and imaginary parts of the cross- 
coupled interferer main-lobe sample renders performance signatures 
practically identical to those of a single-polarized system, in dual- 
polarized operation. To elaborate on the required number of canceler 
taps, it should be obvious in this case that only a single complex tap 
is adequate to remove the main lobe of the interferer. This is because 
when there is no fading or when there is offset fading of the copolarized 
channel, the interferer main lobe always coincides, or approximately 
coincides, with the desired symbol main lobe, and only one complex 
feedback tap is necessary to remove it. However, in the case of midband 
fading of the copolarized signal, as seen in Fig. 7, the timing reference 
of the main path impulse response is offset from the peak of the main 
lobe of the interfering signal. Hence, the canceler does not perform as 
well for midband fades as it would for offset fades of the copolarized 
path. This is because, for offset fades, as the copolarized path fade 
notch moves toward the band edge, the timing reference of the overall 
impulse response moves toward the origin;° hence, the two peaks tend 
to align. Of course, the remedy for the centered fade situation is to 
increase the number of the canceler transversal taps. 

Fig. 7—Canceler performance in dual-polarized 16-QAM radio for a flat fade on 
cross-coupled interfering path. 

In Figs. 7 and 8, all the curves were computed for a 60-dB s/n, 22.5- 
Mbaud symbol rate, Γ = 0.45 roll-off, 16-QAM radio; and in Figs. 9 
and 10, the signatures for 64-QAM radio were computed for a 66-dB 
s/n, 15-Mbaud symbol rate, and Γ = 0.45 roll-off. Note that, in all 
these computations, ISI equalization of the main polarization signal is 
left out. 

To further quantify the influence of interfering signal main lobe on 
the dual-polarized system outage performance, we present an example. 
In the 16-QAM radio case, for an interfering signal defined by the 
triplet (—20, 5, 0) and for a centered fade of 6-dB notch depth on the 
reference copolarized signal, samples taken from u;1, Ug1, Uin, and Ug 
at optimum timing points are listed in Table I. As observed, the peak 
sample of the interferer impulse responses, u; and u,n, have ampli- 
tudes about ten times larger than the sum of absolute values of all 
their other samples taken every baud period. Note that, since 6,, and 
Tm are zero, the imaginary part of the interferer impulse response, Ug.n1, 
is zero because of the notch-centered fading of the main polarization 
signal in this case, that is, ¢1 = 0 (see Fig. 3). To ensure validity of 
the test, for the same interferer fading conditions, we repeated this for 
several different fading conditions of the copolarized signal and 
checked the resulting impulse responses, u;y and u,1. In all the cases 
considered, samples of the real and imaginary parts of the interfering 
signal main lobe ranged somewhere between eight and ten times the 
value of the sum of the absolute values of all other samples. Thus, the 
contribution of the main-lobe sample of the interferer to the total 
peak distortion is much stronger than that of all other samples. 
Therefore, canceling the real and imaginary parts of the interferer 
main lobe should improve the performance significantly, as expected. 

Fig. 8—Canceler performance in dual-polarized 16-QAM radio for a dispersive fade 
on cross-coupled interfering path. 


IV. CONCLUSIONS 


In this paper we proposed a novel baseband cross-polarization 
interference canceler structure that adaptively mitigates the interfer- 
ence in a dual-polarized M-QAM radio system. Employing perform- 
ance signatures (M-curves) for dual-polarized systems, as introduced 
in Ref. 5, we showed that for synchronous transmitters, a single- 
decision feedback complex matrix tap canceler can enhance a dual- 
polarized system availability time to values close to the availability 
time for a single-polarized radio system, for the assumed propagation 
model. 

Fig. 9—Canceler performance in dual-polarized 64-QAM radio for a 20-dB flat fade 
on cross-coupled interfering path. 

Fig. 10—Canceler performance in dual-polarized 64-QAM radio for a 30-dB flat fade 
on cross-coupled interfering path. 


Table I—Impulse response samples of received in-phase signal on 
reference polarization 

    UjI            Ug             Ui             Ug             Time 
    0.5462E-04     0.3645E-19     0.75382E-05    0.1074E-19     t0 + 9T 
    0.1560E-04     0.1690E-18    -0.7399E-06     0.9008E-20     . 
   -0.1268E-03    -0.1694E-18    -0.1594E-04    -0.2812E-19     . 
    0.3483E-04    -0.2848E-18     0.1042E-04    -0.4691E-20     . 
    0.3412E-03     0.6616E-18     0.3914E-04     0.8339E-19     . 
   -0.3989E-03     0.3152E-18    -0.6560E-04    -0.5668E-19     . 
   -0.1374E-02    -0.3568E-17    -0.1411E-03    -0.3695E-18     . 
    0.7526E-02     0.9137E-17     0.9628E-03     0.1635E-17     t0 + 2T 
   -0.2679E-01    -0.2553E-16    -0.3555E-02    -0.5556E-17     t0 + T 
    0.5328E+00     0.1171E-14     0.5860E-01     0.1354E-15     t0 
   -0.1332E-01    -0.1469E-15     0.6797E-03    -0.7793E-17     t0 - T 
    0.2236E-02     0.3376E-16    -0.2800E-03     0.1649E-17     t0 - 2T 
   -0.1519E-02    -0.9138E-17    -0.6131E-04    -0.6031E-18     . 
    0.5787E-03     0.4275E-18     0.7905E-04     0.1154E-18     . 
    0.8068E-04     0.1226E-17    -0.1024E-04    -0.5979E-19     . 
   -0.1865E-03    -0.3837E-18    -0.2098E-04    -0.4639E-19     . 
    0.2264E-04    -0.2996E-18     0.8858E-05    -0.7337E-20     . 
    0.7182E-04     0.2201E-18     0.6764E-05     0.2058E-19     . 
   -0.2951E-04     0.7752E-19    -0.5842E-05    -0.2163E-20     t0 - 9T 


REFERENCES 


1. C. A. Baird and G. Pelchat, “Cross Polarization Techniques Investigation,” Harris 
   Corporation Report No. RADC-TR-77-244, July 1977. 
2. B. E. Gillingham et al., “Cross Polarization Interference Reduction Techniques,” 
   Harris Corporation Report No. RADC-TR-79-154, June 1979. 
3. J. Namiki and S. Takahara, “Adaptive Receiver for Cross-Polarized Digital Trans- 
   mission,” Int. Conf. Commun., June 14-18, 1981, Denver, Colorado, Paper 46.3.1. 
4. M. L. Steinberger, “Design of a Terrestrial Cross-Pol Canceler,” Int. Conf. Com- 
   mun., June 1982, Philadelphia, pp. 2B.6.1-5. 
5. M. Kavehrad and C. A. Siller, private communication. 
6. W. D. Rummler, “A New Selective Fading Model: Application to Propagation Data,” 
   B.S.T.J., 58, No. 5 (May-June 1979), pp. 1037-71. 
7. M. Kavehrad, “Performance of Cross-Polarized M-ary QAM Signals Over Nondis- 
   persive Fading Channels,” AT&T Bell Lab. Tech. J., 63 (March 1984), pp. 499-521. 
8. J. G. Proakis, Digital Communications, New York: McGraw-Hill, 1983. 


AUTHOR 


Mohsen Kavehrad, B.S. (Electrical Engineering), 1973, Tehran Polytechnic 
Institute; M.S. (Electrical Engineering), 1975, Worcester Polytechnic Insti- 
tute; Ph.D. (Electrical Engineering), 1977, Polytechnic Institute of New York; 
Fairchild Industries, 1977-1978; GTE, 1978-1981; on the faculty of North- 
eastern University, 1981-1984; AT&T Bell Laboratories, 1981—. At AT&T 
Bell Laboratories Mr. Kavehrad is a member of the Communications Methods 
Research Department. His research interests are digital communications and 
computer networks. He is a Technical Editor for the IEEE Communications 
Magazine. He established and was the Chairman of the IEEE Communications 
Chapter of New Hampshire in 1984. Member, IEEE, Sigma Xi. 




AT&T Technical Journal 
Vol. 64, No. 8, October 1985 
Printed in U.S.A. 


Performance of Low-Complexity Channel 
Coding and Diversity for Spread Spectrum in 
Indoor, Wireless Communication 


By M. KAVEHRAD* and P. J. McLANE† 
(Manuscript received January 30, 1985) 


The application of selection diversity in conjunction with simple channel 
coding is considered for a multiuser, slowly fading, Spread-Spectrum Multiple 
Access (SSMA), digital radio system. For the most part, the index of perfor- 
mance for our study is the average bit error probability; we also give some 
consideration to multipath outage as a performance measure. All subscribers 
are assumed to communicate to a central station; that is, a star network 
architecture is assumed. Average power control is also assumed. The average 
mentioned in this context includes averaging over the channel fading statistics. 
The modulation is direct-sequence, spread-spectrum, binary phase-shift key- 
ing. We assume perfect timing and carrier recovery in our coherent receiver, 
and a slowly varying, Rayleigh fading, discrete multipath model is used. 
Previous analyses have found that SSMA can tolerate few simultaneous users 
for fading radio channels. We find that the combination of spread-spectrum 
modulation with low-complexity diversity and/or channel coding can restore 
fading-channel user levels to an acceptable figure. In addition, selection 
diversity plus channel coding is more effective than either method by itself. 
Finally, it turns out that SSMA is less sensitive to a change in the value of 
delay spread of a fading channel than, say, time-division multiple access. The 
method of moments is used to accurately assess the system error probability. 
Using this technique, we also assess the accuracy of assuming that the 
multiuser interference has a Gaussian distribution, which allows it to be 
analyzed by a simple method. Using this assumption, we compare selection 
diversity plus channel coding with the maximal-ratio-combining technique for 
diversity reception. Except for a high order of diversity, the former is more 
efficient and is always less complex than the latter. 


I. INTRODUCTION 


In a recent paper Kavehrad has presented a technique to evaluate 
the performance of direct-sequence, spread-spectrum binary phase- 
shift-keying modulation for an Indoor, Wireless Communication 
(IWC) channel.¹ The analysis uses the method of moments,² which 
gives accurate estimates of error probability for many digital commu- 
nication systems. Kavehrad did not consider diversity in his study. 
We find that Kavehrad’s formulas are only slightly modified when 
selection diversity (see pages 313 through 316 of Ref. 3) is included in 
his reception model. We also determine the effect on system perfor- 
mance of simple channel-coding techniques. Our use of channel coding 
in spread-spectrum systems is similar to the case reported in Ref. 4, 
which involved frequency hopping. Both the (7, 4) Hamming code and 
(15, 7) BCH code are considered in our analyses. We assume hard 
decisions are made by the demodulator and that its error-producing 
mechanism results in independent error events. The latter assumption 
requires interleaving at the transmitter and de-interleaving at the 
receiver as a slowly fading channel model is considered. As the in- 
tended application is to packet transmission, interleaving does not 
present a severe system problem. A discussion of interleaving is given 
in Appendix 3A of Ref. 5. 

The references to the channel-coding aspects of our study are 
important, as channel coding is found to be an effective method of 
combating multiuser interference in Spread-Spectrum Multiple Access 
(SSMA) systems. This was earlier found by Livine for no signal 
fading.® In an earlier study Turin’ found that SSMA can tolerate 
considerably less multiuser interference in a fading channel than can 
be allowed in an additive white Gaussian noise channel. Adopting a 
Rayleigh fading model that seems less severe than the model used by 
Turin’ for mobile communication applications, it is found that channel 
coding plus selection diversity performs well in a multiuser environ- 
ment because the combination can be optimized. Channel coding used 
with selection diversity is found to perform better than selection 
diversity alone for the same spread-spectrum system bandwidth. In 
this sense it is both power and bandwidth efficient, as can be deduced 
from the similar study of Milstein et al.* in the absence of fading. This 
is true for the simple block codes mentioned above. Using more 
powerful codes and/or soft-decision decoding would give even greater 
gains in performance. Our approach has been to adopt a simple 
detection and decoding system to observe how a relatively simple 
system performs. 

Fig. 1—A star-connected, indoor, wireless, local area network. Each user has a unique 
spread-spectrum code. 

For this study we assume an indoor, wireless communication chan- 
nel offering both voice and data services. The speech transmission 
rate for each user assumed in our system parameter study for IWC 
systems is 32 kb/s. Packet transmission is assumed and all users 
communicate through a central station in a star network architecture. 
Figure 1 is a simple block diagram of the system we analyze. Each 
active user in the system depicted in Fig. 1 has a unique spread- 
spectrum code, which is used for communication to the central station. 
The central station contains a bank of spread-spectrum receivers, one 
for each active user. Its function is to determine which subscribers are 
active and to detect the digital information sent in each case. The 
basis of the spread-spectrum receiver for each active user that exists 
in the central station is a Surface Acoustic Wave (SAW) device. Such 
devices have been found to be effective in such a role in the earlier 
study of Freret et al.° A tutorial on SAW devices can be found in Ref. 
9. We note that the study in Ref. 8 proposed the use of spread- 
spectrum diversity in IWC systems. This diversity is inherently sup- 
plied by the spread-spectrum modulation as long as the spread-spec- 
trum bandwidth exceeds the coherence bandwidth of the slowly fading 
channel (see Ref. 10, page 480). More discussion on this point is 
presented in Section II. We note that spread-spectrum modulation 
can provide both asynchronous multiple access and also diversity 
reception.”!° Other advantages of spread-spectrum in IWC systems 
are discussed by Kavehrad.’ 

We present an analysis of the bit error probability for the link 
between any active user and its receiver in the central station. This 
link is shown in Fig. 1. As such, we are only considering the commu- 
nication upstream from an active user to the central station. The 
downstream communication path is much simpler and will not be 
considered. We assume that average power control is used by all active 
users in that, on the average, all active user signals are assumed to 
arrive at the central station with the same power (in the upstream 
communication mode). The average here includes the fading statistics. 
Thus, the power control that must be used by each active user just 
depends on the distance and power law exponent for the link from a 
user to the central station and also on the static, shadow fading that 
is encountered. The sources of signal fading are nicely summarized in 
Section II of Ref. 11. We do not consider the dynamics of average 
power control in this paper. 

The main contribution of this paper is to show that Kaveh- 
rad’s analyses¹ can be extended to include selection diversity, and that 
this form of diversity can be used in conjunction with simple channel 
coding to give an SSMA system with an acceptable number of active 
users. For instance, for an IWC system having a multipath delay 
spread, $T_m$, of 100 nanoseconds, we find that a spread-spectrum code 
length of 255, a source data rate of 32 kb/s, and a (15, 7), double-error- 
correcting BCH code can support approximately 75 simultaneous 
users. If we envision a low-traffic office environment with a 10-percent 
channel utilization, a total of 750 subscriber terminals can be sup- 
ported using the aforementioned method. This assumes that each code 
is shared among a group of subscribers in a contention mode of 
operation. 

An outline of the paper is as follows. The system-fading and multi- 
path model is described in Section II. Section III outlines the use of 
the computational technique from Ref. 1 to compute the average 
system error probability. Section IV considers an approximate com- 
putational technique based on a Gaussian assumption for the multiuser 
interference. Section V considers simple channel codes and Section 
VI contains our numerical results. Section VII presents an application 
of our computational results to two IWC scenarios for local area 
networks, as well as our consideration of multipath outage as a 
performance criterion. 


II. SYSTEM MODEL 
2.1 Transmission model 


Our mathematical model will depend heavily on the model developed 
by Kavehrad.¹ We will use the same notation and borrow heavily from 
Kavehrad’s earlier analysis. Consider the block diagram of the multi- 
user, IWC channel shown in Fig. 2. The structure is exactly as in Fig. 
1. However, the receiving systems for a single reference user, taken as 
user 1, are shown; for simplicity only a two-antenna system is depicted 
in Fig. 2. 

In Fig. 1 each active user has a code waveform that consists of a 
periodic (period $T$) sequence of $N$ nonoverlapping rectangular wave- 
forms (called chips), each of duration $T_c$ seconds. The length of 
the code waveform is $T$ seconds, the reciprocal of the symbol trans- 
mission rate, where $T = NT_c$. The sequence of chip waveforms is the 
spread-spectrum code waveform, which for the kth user is denoted by 
$a_k(t)$. The data signal is binary with data symbols $b_j^k$, where the 
subscript denotes the jth time slot and the superscript denotes the 
data symbol for the kth user. If we let $P_T(t)$ denote a rectangular pulse 
of unit height and duration $T$, the transmitted signal for the kth user 
is 

$$S_k(t) = A\,a_k(t)\,b_k(t)\cos(\omega_c t + \theta_k), \tag{1a}$$

$$a_k(t) = \sum_{i=-\infty}^{\infty} a_i^k\,P_{T_c}(t - iT_c), \tag{1b}$$

and 

$$b_k(t) = \sum_{j=-\infty}^{\infty} b_j^k\,P_T(t - jT), \tag{1c}$$

where $a_i^k$ is the ith chip amplitude for the kth user, $\omega_c T$ is $2\pi$ 
times an integer, $\omega_c$ is the carrier frequency in rad/s, $A$ is the signal 
amplitude, and $\theta_k$ is the signal phase. In our analysis we shall have 
$k = 1, 2, \ldots, K$, where $K$ will denote the number of simultaneous users. 
In Fig. 2 we show $L$ discrete multipath links between each user and 
each receive antenna at the central station. The low-pass equivalent 
impulse response of the passband channel for the link between the 
kth user transmitter and central station receiver is 


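$$h_k(\tau) = \sum_{\ell=1}^{L} \beta_{\ell k}\,\delta(\tau - \tau_{\ell k})\,e^{j\Phi_{\ell k}}, \tag{2}$$

where for the kth user 
$\beta_{\ell k}$ = ℓth path gain 
$\Phi_{\ell k}$ = ℓth path phase 
$j = \sqrt{-1}$ 
and 
$\tau_{\ell k}$ = ℓth path time delay. 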

We assume $\beta_{\ell k}$ is a Rayleigh random variable; $\Phi_{\ell k}$ is taken as uniform 
in $[0, 2\pi]$; and $\tau_{\ell k}$ is assumed uniform in $[0, T]$, where $T$ is the data 
symbol interval. In the sequel the difference between the maximum 
and minimum values of $\tau_{\ell k}$ will be called the maximum multipath 
delay spread and will be denoted by $T_m$. Also, as a slowly fading 
channel is assumed, the variables in eq. (2) are assumed random but 
time-invariant. 
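
As an illustration of the random model just described, the following is a minimal sketch that draws one realization of the path gains, phases, and delays in eq. (2). The function and parameter names are hypothetical, and the Rayleigh scale is an assumption chosen so that the mean-square gain equals 2ρ0, as in eq. (6). The sketch does not enforce any minimum spacing between path delays.

```python
import math
import random


def draw_multipath_channel(num_paths, symbol_time, rho0, rng=None):
    """Draw (gain, phase, delay) triples for one user's link, following eq. (2).

    Assumptions (illustrative, not taken from the paper's software):
    gains are Rayleigh with E[beta^2] = 2 * rho0, phases are uniform on
    [0, 2*pi], and delays are uniform on [0, symbol_time].
    """
    rng = rng or random.Random()
    sigma = math.sqrt(rho0)
    paths = []
    for _ in range(num_paths):
        # A Rayleigh gain is the magnitude of a complex Gaussian whose real
        # and imaginary parts each have variance rho0.
        beta = math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        phi = rng.uniform(0.0, 2.0 * math.pi)
        tau = rng.uniform(0.0, symbol_time)
        paths.append((beta, phi, tau))
    return paths


def delay_spread(paths):
    """Difference between the largest and smallest path delay."""
    delays = [tau for _, _, tau in paths]
    return max(delays) - min(delays)
```

Because the channel is slowly fading, one such draw would be held fixed over many symbol intervals in a simulation.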

The impulse response given in eq. (2) is characteristic of a discrete 
multipath channel and has the same functional form as that given in 
Ref. 12. The question is, how do we determine $L$ in terms of commu- 
nication system parameters? The basic result on the time resolution 
of signals using spread-spectrum signals is given in Section 1.5.3 of 
the recent textbook by Simon et al.⁵ As one would expect, two signals 
must be separated by one chip time, $T_c$, in order to be resolved. We 
illustrate this in Figs. 3 and 4. Figure 3 is a system block diagram plus 
a diagram depicting the discrete multipath model. Figure 4 shows the 
response due to $L = 8$ multipath components. 

Fig. 3—Multiuser spread-spectrum system for (a) a baseband system and (b) a 
discrete multipath model. 

We assume that the maximum multipath delay spread, $T_m$, is less than 
$T$, the information bit interval, in order to avoid intersymbol 
interference. Using the result of time resolution of direct-sequence, 
spread-spectrum signals given above, one sees that 

$$L = \lfloor T_m B_{ss} \rfloor + 1 \tag{3}$$

is the maximum number of resolved paths for a maximum multipath 
delay spread of $T_m$ seconds. Also, in eq. (3) $\lfloor x \rfloor$ is the largest integer 
that is less than or equal to $x$ and $B_{ss} = NR_0$, the one-sided bandwidth 
of the spread-spectrum signal, where $R_0 = T^{-1}$ and $T/T_c = N$ is the 
sequence length. Copies of the transmitted signal that arrive at unre- 
solvable time differences are assumed to combine to give rise to the 
Rayleigh path gain of eq. (2). As such, we should assume that the time 
difference, $\tau_i - \tau_j$, is greater than $T_c$, where each individual $\tau$ is 
uniform in $(0, T)$. We take $\tau_i - \tau_j > 0$, which approximately is true as 
$T_c = T/N$ is small relative to $T$. 
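
The path count of eq. (3) is a one-line computation. The sketch below assumes the spread bandwidth is the product of the sequence length and the symbol rate, as stated above; the function name and the numerical values in the usage note are illustrative only.

```python
import math


def max_resolvable_paths(delay_spread_s, code_length, symbol_rate_hz):
    """Maximum number of resolved paths, L = floor(T_m * N * R_0) + 1, per eq. (3)."""
    spread_bandwidth_hz = code_length * symbol_rate_hz  # N * R_0, i.e. 1 / T_c
    return math.floor(delay_spread_s * spread_bandwidth_hz) + 1
```

For example, with a 100-ns delay spread, a code length of 255, and an assumed (uncoded) symbol rate of 32 kb/s, the delay spread is shorter than one chip and `max_resolvable_paths(100e-9, 255, 32e3)` gives a single resolvable path; lengthening the delay spread to 250 ns gives three.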

Actually, we feel that L in eq. (3) represents the maximum number 
of approximately uncorrelated terms one could have in eq. (2). This is 
because L is based on the minimum resolution of direct-sequence, 
spread-spectrum signals. A random model is one in which L varies 
between unity and the maximum value given in eq. (3). 

Our approach to treating L in the paper will be as follows. Up to 
the IWC parameter study presented in Section VII, we determine 
system performance in general for any K, L, and M. Here K is the 
number of simultaneous users, L the number of paths, and M the 
order of diversity. In Section VII we then adopt two models for L. In 
one, L is given by eq. (3). In the other, L varies uniformly between 
unity and the maximum value given in eq. (3). The results turn out to 
be insensitive to the model used for L. As better models for L evolve, 
our general results can be used to estimate performance in such cases. 

If we combine eqs. (1a) and (2) and use the convolution integral, 
the received signal at the central station, which will be denoted as 
$r(t)$, is given by 

$$r(t) = \mathrm{Re}\left\{\sum_{k=1}^{K}\int_{-\infty}^{\infty} h_k(\tau)\,\tilde{S}_k(t-\tau)\,d\tau\,\exp(j\omega_c t)\right\} + n(t), \tag{4}$$

where $\tilde{S}_k(t)$ is the complex envelope of $S_k(t)$ for $\theta_k = 0$ and $\mathrm{Re}\{\cdot\}$ 
denotes the real part of a complex number. Upon use of eqs. (1), (2), 
and (4), we have 

$$\begin{aligned}
r(t) = {} & A\sum_{\ell=1}^{L}\beta_{\ell 1}\,a_1(t-\tau_{\ell 1})\,b_1(t-\tau_{\ell 1})\cos(\omega_c t+\Phi_{\ell 1}) \\
& + A\sum_{\ell=1}^{L}\sum_{k=2}^{K}\beta_{\ell k}\,a_k(t-\tau_{\ell k})\,b_k(t-\tau_{\ell k})\cos(\omega_c t+\Phi_{\ell k}) + n(t).
\end{aligned} \tag{5}$$
The white Gaussian noise, $n(t)$, in eq. (5) will have a spectral height 
of $N_0/2$ W/Hz. In eq. (5) $\tau_{\ell k}$ will be uniform in $[0, T]$, $\Phi_{\ell k}$ uniform in 
$[0, 2\pi]$, and $\beta_{\ell k}$ will have the Rayleigh probability density function 
(pdf) 

$$f_\beta(x) = \frac{x}{\rho_0}\exp\left(-\frac{x^2}{2\rho_0}\right)u(x), \tag{6}$$

where $u(x)$ is the unit step function, $u(x) = 1$ for $x \ge 0$ and zero 
elsewhere. As such, the average received signal-to-white-Gaussian- 
noise ratio is 

$$\gamma_0 = E(\beta_{j1}^2)\,\frac{E_b}{N_0} = 2\rho_0 E_b/N_0, \tag{7}$$

where $E_b = A^2T/2$, the signal energy per bit, and $E(\beta_{j1}^2) = 2\rho_0$, where 
for user one $\beta_{j1}$ is the random gain of the jth signal path. In fact $\gamma_0 = 
E(\gamma)$, where $\gamma = \beta_{j1}^2 E_b/N_0$ has the exponential pdf 

$$f_\gamma(y) = \gamma_0^{-1}\exp\left(-\frac{y}{\gamma_0}\right)u(y) \tag{8}$$

with $\gamma_0$ as in eq. (7). 
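
The step from the Rayleigh pdf of eq. (6) to the exponential pdf of eq. (8) is a change of variables. A short worked version follows, assuming only the definitions given above (a Rayleigh gain β with mean-square value 2ρ0, and γ equal to β² times the ratio of bit energy to noise density).

```latex
% Change of variables from beta (Rayleigh) to gamma = beta^2 E_b / N_0.
\begin{align*}
x &= \sqrt{y N_0 / E_b}, \qquad
\frac{d\gamma}{dx} = \frac{2 x E_b}{N_0}, \\
f_\gamma(y) &= \frac{f_\beta(x)}{|d\gamma/dx|}
  = \frac{x}{\rho_0}\exp\!\left(-\frac{x^2}{2\rho_0}\right)\frac{N_0}{2 x E_b}
  = \frac{N_0}{2\rho_0 E_b}\exp\!\left(-\frac{y N_0}{2\rho_0 E_b}\right) \\
  &= \frac{1}{\gamma_0}\exp\!\left(-\frac{y}{\gamma_0}\right),
  \qquad y \ge 0, \quad \gamma_0 = \frac{2\rho_0 E_b}{N_0}.
\end{align*}
```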

The specification of our channel model is now complete. Note that 
the formulation represented by eq. (5) pertains to only the discrete 
multipath model whose impulse response is given by eq. (2). We note 
that the transmission model is similar, except for the specification of 
the fading parameters, to that used by Pursley.’° In Ref. 14 multipath 
diversity reception is considered. However, a Gaussian assumption is 
used in the performance computations. 


2.2 Receiver model 


The input to the receiver for the reference user is given by the right- 
hand side of eq. (5). The first term in this equation represents all the 
copies of the transmitted, spread-spectrum signal that are available 
for detection. Let us assume that the receiver can ideally lock on to 
the term at delay $\tau_{j1}$ and phase $\Phi_{j1}$. Then the jth decision variable for 
detection is given by 

$$\xi_j = \int_0^T r(t)\,a_1(t-\tau_{j1})\cos(\omega_c t+\Phi_{j1})\,dt. \tag{9}$$

If the order of diversity used in the receiver is $M$, we will have $\xi_j$, 
$j = 1, 2, \ldots, M$, decision variables available for detection purposes. 
Substituting eq. (5) into eq. (9) yields the result 

$$\begin{aligned}
\xi_j = {} & \frac{AT}{2}\,b_0^1\,\beta_{j1}
 + \frac{A}{2}\sum_{\substack{\ell=1\\ \ell\neq j}}^{L}\beta_{\ell 1}\cos[\Phi_{\ell 1}-\Phi_{j1}]
 \int_0^T a_1(t-\tau_{\ell 1})\,b_1(t-\tau_{\ell 1})\,a_1(t-\tau_{j1})\,dt \\
& + \frac{A}{2}\sum_{\ell=1}^{L}\sum_{k=2}^{K}\beta_{\ell k}\cos[\Phi_{\ell k}-\Phi_{j1}]
 \int_0^T a_k(t-\tau_{\ell k})\,b_k(t-\tau_{\ell k})\,a_1(t-\tau_{j1})\,dt \\
& + \int_0^T n(t)\,a_1(t-\tau_{j1})\cos[\omega_c t+\Phi_{j1}]\,dt,
\end{aligned} \tag{10}$$


where $b_0^1$ is the data bit to be detected. If one consults Ref. 1, it is clear 
that our eq. (10) is equivalent to eq. (8) of that reference with one 
difference; that is, in our eq. (10), $L$ fading paths are assumed for each 
interferer, as it is important in this work. Following this same refer- 
ence, one can express eq. (10) in the form 

$$\begin{aligned}
\xi_j = {} & \frac{AT}{2}\,\beta_{j1}\,b_0^1
 + \frac{A}{2}\sum_{\substack{\ell=1\\ \ell\neq j}}^{L}\beta_{\ell 1}\cos[\Phi_{\ell 1}-\Phi_{j1}]
 \,[\,b_{-1}^1 R_{11}(\tau_{\ell 1}) + b_0^1 \hat{R}_{11}(\tau_{\ell 1})\,] \\
& + \frac{A}{2}\sum_{\ell=1}^{L}\sum_{k=2}^{K}\beta_{\ell k}\cos[\Phi_{\ell k}-\Phi_{j1}]
 \,[\,b_{-1}^k R_{k1}(\tau_{\ell k}) + b_0^k \hat{R}_{k1}(\tau_{\ell k})\,] + \nu,
\end{aligned} \tag{10a}$$

where $\tau_{\ell k}$ now stands for $\tau_{\ell k} - \tau_{j1}$, $\nu$ is Gaussian with zero mean and 
variance $N_0T/4$, 

$$R_{k1}(\tau) = \int_0^{\tau} a_k(t-\tau)\,a_1(t)\,dt, \tag{10b}$$

and 

$$\hat{R}_{k1}(\tau) = \int_{\tau}^{T} a_k(t-\tau)\,a_1(t)\,dt. \tag{10c}$$

Although eq. (10) is rather long, each term in this equation can be 
interpreted with respect to the block diagram shown in Fig. 2. The 
first term in eq. (10) represents the desired signal to be detected. The 
second term in eq. (10) is the self-interference for the reference user 
(say, User 1 in Fig. 2) due to sidelobes of the autocorrelation function 
of the spread-spectrum code of User 1. The third term in eq. (10) is 
the $L(K - 1)$ multiuser interference terms from the $K - 1$ other 
simultaneous users of the system. Finally, the last term in eq. (10) is 
the Gaussian random variable due to additive white Gaussian noise. 
Note that there are $L - 1$ self-interference plus $L(K - 1)$ multiuser 
interference terms in eq. (10). Thus, the total number of interference 
terms is $n = L(K - 1) + L - 1 = LK - 1 \approx LK$ for a large $LK$. We will 
find in our computations that the per-user average error probability 
is, for all practical purposes, a function of $n = LK$. We note that $\xi_j$ in 
eq. (10a) could correspond to any diversity term for the receiving 
system shown in Fig. 2. For instance, if there are two antennas and 
$L = 2$ (see Fig. 2), we have two antennas and two paths per antenna 
to give a total order of diversity of $M = 4$. 

As we have noted previously, our eq. (10) is equivalent to eq. (8) of 
Ref. 1. Also eq. (10a) is equivalent to eq. (10) of this reference. 
However, in our detection procedure we assume that selection diversity 
is used. That is, we can find $\xi_j$ in eq. (10) such that $\beta_1$ is the largest 
path gain relative to User 1, so that 

$$\beta_1 = \max(\beta_{11}, \beta_{21}, \ldots, \beta_{M1}).$$

Let $\xi$ be that value of $\xi_j$ corresponding to path gain $\beta_1$. Then our eq. 
(10) with $\xi_j$ replaced by $\xi$ is equivalent to eq. (8) of Ref. 1, except that 
$\beta_1$ has the pdf of the maximum of the path gains $\beta_{j1}$, $j = 1, 2, \ldots, M$. 
As shown by Jakes,³ pages 313 through 316, or Papoulis, pages 139 
and 140, the pdf of $\beta_1^2$, the maximum of the $\beta_{j1}^2$, is easily found. It is 
this pdf that will be needed in our error probability analyses. 


III. ERROR PROBABILITY 


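We will return to the subject of the pdf of $\beta_1^2$ later. Recall that our 
eq. (10) is equivalent to eq. (8) of Ref. 1. If we mimic the development 
through to eq. (22) of this reference, we find that the probability of 
error, conditioned on $\beta_1$, the self-interference, and the multiuser 
interference is given by 

$$P(e \mid \beta_1, x, z) = \frac{1}{2}\,\mathrm{erfc}\left[\sqrt{\frac{E_b}{N_0}}\,(\beta_1 + x + z)\right], \tag{11}$$

where 

$$\mathrm{erfc}(u) = \frac{2}{\sqrt{\pi}}\int_u^{\infty}\exp(-y^2)\,dy,$$

$$x = \frac{1}{T}\sum_{\substack{\ell=1\\ \ell\neq j}}^{L}\beta_{\ell 1}\{b_{-1}^1 R_{11}(\tau_{\ell 1}) + \hat{R}_{11}(\tau_{\ell 1})\}\cos\Psi_\ell, \tag{12}$$

$$z = \frac{1}{T}\sum_{\ell=1}^{L}\sum_{k=2}^{K}\beta_{\ell k}\{b_{-1}^k R_{k1}(\tau_{\ell k}) + b_0^k \hat{R}_{k1}(\tau_{\ell k})\}\cos\Theta_{\ell k}, \tag{13}$$

where $\tau_{\ell k}$ stands for $\tau_{\ell k} - \tau_{j1}$, $\Psi_\ell = \Phi_{\ell 1} - \Phi_{j1}$, and $\Theta_{\ell k} = \Phi_{\ell k} - \Phi_{j1}$. The index $j$ 
in, say, $\tau_{j1}$, is taken as the delay for the path having the largest $\beta_{j1}$. 
The largest $\beta_{j1}$ is denoted as $\beta_1$ and this random variable is independent 
of $\beta_{i1}$, $i \neq j$. Thus $x$ and $z$, which are mutually independent, are also 
both independent of $\beta_1$. 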

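Kavehrad’s¹ technique of finding the average error probability, $P(e)$, 
was to integrate $P(e \mid \beta_1, x, z)$ with respect to the pdf of $\beta_1^2$ and then 
remove the conditioning on $(x, z)$ using the method of moments. 
Actually our result for $P(e)$ in the case of selection diversity can be 
deduced from Kavehrad’s mathematics. Let the pdf for $\beta_1$ be the 
Rayleigh pdf given in eq. (6). The pdf for $\beta_1^2$ is 

$$f_{\beta^2}(y) = \frac{1}{\rho_0'}\exp\left[-\frac{y}{\rho_0'}\right]u(y) \tag{14}$$

with $\rho_0' = 2\rho_0 = E(\beta_1^2)$. In eq. (11) we have 

$$P(e \mid x, z) = \int_0^{\infty} f_{\beta^2}(y)\,P(e \mid \beta_1, x, z)\,dy = K\left(\frac{E_b}{N_0}, \gamma_0, D\right), \tag{15}$$

where $\gamma_0 = \rho_0' E_b/N_0$, $D = x + z$, 

$$K(v, \gamma_0, D) = \frac{1}{2}\,\mathrm{erfc}[\sqrt{v}\,D]
 - \frac{1}{2}\sqrt{\frac{\gamma_0}{1+\gamma_0}}\,\exp\left[-\frac{vD^2}{1+\gamma_0}\right]
 \mathrm{erfc}\left[D\sqrt{\frac{v\gamma_0}{1+\gamma_0}}\right] \tag{16}$$

and $v = E_b/N_0$. This result is the same as in eq. (30) of Ref. 1. 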
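
The averaging step in eq. (15) can be checked numerically. The sketch below is an illustration, not the authors' software: it assumes the conditional error probability is one half of erfc evaluated at the square root of v times (β + D), with β² exponentially distributed with mean γ0/v, writes down the closed form that this integral produces, and compares it with a direct midpoint-rule evaluation. The function names, step count, and truncation point are illustrative choices.

```python
import math


def k_function(v, gamma0, d):
    """Closed form for the average of 0.5*erfc(sqrt(v)*(beta + d)) over beta**2.

    beta**2 is assumed exponential with mean gamma0 / v.
    """
    ratio = gamma0 / (1.0 + gamma0)
    return (0.5 * math.erfc(math.sqrt(v) * d)
            - 0.5 * math.sqrt(ratio)
            * math.exp(-v * d * d / (1.0 + gamma0))
            * math.erfc(d * math.sqrt(v * ratio)))


def k_numeric(v, gamma0, d, steps=20000, s_max=8.0):
    """Midpoint-rule evaluation of the same average, using y = rho * s**2."""
    rho = gamma0 / v  # mean of beta**2
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        beta = math.sqrt(rho) * s
        # (1/rho) * exp(-y/rho) dy becomes 2 * s * exp(-s**2) ds.
        total += 2.0 * s * math.exp(-s * s) * 0.5 * math.erfc(math.sqrt(v) * (beta + d))
    return total * h
```

Setting the interference term to zero reduces the closed form to the familiar single-path Rayleigh result, one half of one minus the square root of γ0/(1 + γ0).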
It turns out that a result similar to eq. (14) is obtained when $\beta_1^2$ is 
the maximum of the $\beta_{j1}^2$, $j = 1, 2, \ldots, M$. If all the $\beta_{j1}$’s are Rayleigh 
with $E(\beta_{j1}^2) = 2\rho_0 = \rho_0'$, 

$$f_{\beta_1^2}(y) = \frac{M}{\rho_0'}\left(1 - \exp\left[-\frac{y}{\rho_0'}\right]\right)^{M-1}\exp\left[-\frac{y}{\rho_0'}\right]u(y), \tag{17}$$

as follows from eq. (5.2-7) of Jakes.³ Using the binomial theorem in 
eq. (17), 

$$f_{\beta_1^2}(y) = M\sum_{k=0}^{M-1}\binom{M-1}{k}\frac{(-1)^k}{\rho_0'}\exp[-y/\rho_{0k}']\,u(y), \tag{18}$$

where $\rho_{0k}' = \rho_0'/(k + 1)$ and 

$$\binom{N}{k} = \frac{N!}{(N-k)!\,k!}. \tag{19}$$
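
The binomial expansion that turns the closed-form pdf of the largest path power into a signed sum of exponentials can be checked numerically. A minimal sketch follows; the function names are illustrative, and `mean_power` plays the role of the common mean-square path gain.

```python
import math


def pdf_max_closed_form(y, order, mean_power):
    """Pdf of the largest of `order` i.i.d. exponential powers, as in eq. (17)."""
    if y < 0:
        return 0.0
    e = math.exp(-y / mean_power)
    return order / mean_power * (1.0 - e) ** (order - 1) * e


def pdf_max_expanded(y, order, mean_power):
    """The same pdf written as a signed sum of exponentials, as in eq. (18)."""
    if y < 0:
        return 0.0
    total = 0.0
    for k in range(order):
        total += (math.comb(order - 1, k) * (-1) ** k
                  * math.exp(-(k + 1) * y / mean_power))
    return order / mean_power * total
```

Each term of the sum is a scaled exponential pdf with mean `mean_power / (k + 1)`, which is why any average already known for the single-path (exponential) case carries over term by term.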


The evaluation of the integral in eq. (15) for the pdf in eq. (18) will 
just involve a summation of the $K(\cdot,\cdot,\cdot)$ function of eq. (16). This 
follows as the pdf in eq. (18) is just a summation of the exponential 
pdf’s, $(1/\rho_{0k}')\exp(-y/\rho_{0k}')$. Thus, use of eqs. (14) and (15) in the evalu- 
ation of the integral in eq. (15) for $f_{\beta_1^2}(y)$ in eq. (18) yields 

$$P(e \mid x, z) = M\sum_{k=0}^{M-1}\binom{M-1}{k}\frac{(-1)^k}{k+1}\,K\left(\frac{E_b}{N_0}, \frac{\gamma_0}{k+1}, D\right), \tag{20}$$

where the $K(\cdot,\cdot,\cdot)$ function is as specified in eq. (16). This is the main 
mathematical result of this study. Kavehrad¹ showed how to average 
the $K(\cdot,\cdot, D)$ function with respect to $D = x + z$. To remove the 
dependence of $P(e \mid x, z)$ on $D = x + z$, one just carries out Kavehrad’s 
procedure once, to get the moments of $D = x + z$, and then evaluates 
the resulting $K[\cdot, \gamma_0/(k + 1), D]$ function for all $\gamma_0/(k + 1)$. In 
mathematical terms, 

$$P(e) = M\sum_{k=0}^{M-1}\binom{M-1}{k}\frac{(-1)^k}{k+1}\sum_{i} w_i\,K\left(\frac{E_b}{N_0}, \frac{\gamma_0}{k+1}, \zeta_i\right), \tag{21}$$

where $(w_i, \zeta_i)$ are the weights and nodes, respectively, of the Gauss- 
Quadrature algorithm (see Appendix C of Ref. 1). For $M = 1$ the result 
in eq. (21) reduces to eq. (32) of Kavehrad’s analysis in Ref. 1. 


IV. GAUSSIAN ASSUMPTION 


The Gaussian assumption is to take all the multiuser interference 
as Gaussian noise. We will base our calculation of the average error 
probability on eq. (10a), which is equivalent to eq. (10) of Ref. 1. The 
last term in eq. (10a), $\nu$, is a Gaussian random variable having zero 
mean and variance $N_0T/4$. The first term in eq. (10a) is the signal 
term and it has average power $\beta_1^2A^2T^2/4$ for a fixed $\beta_1$. The rest of 
the terms in eq. (10a) are all mutually independent. To calculate the 
total power of these terms, we must evaluate a term like 

$$\epsilon^2 = E\left\{\left[\frac{a_{-1}R_{k1}(\tau) + a_0\hat{R}_{k1}(\tau)}{T}\right]^2\right\}, \tag{22}$$

where $a_i = \pm 1$, $i = -1, 0$, are independent binary variables. In eq. (22), 
$\epsilon^2$ was shown by Pursley’® to have the value $2/(3N)$, where $N$ is the 
sequence length of the Gold codes considered in Ref. 16. 

There are approximately $n = LK$ such expectations in eq. (10a). 
Hence, for a fixed $\beta_1$ we have 

$$\text{signal power} = \left(\frac{AT}{2}\right)^2\beta_1^2,$$

$$\text{interference power} = n\left(\frac{AT}{2}\right)^2\epsilon^2\,E(\beta^2)/2,$$

and 

$$\text{noise power} = N_0T/4,$$

where $\beta$ denotes a Rayleigh random variable and is any of the inde- 
pendent identically distributed random variables $\beta_{\ell k}$ excluding $\beta_1$. 
The term $E(\beta^2)/2$ occurs in the interference power as $\beta\cos\theta$ is 
Gaussian with zero mean and variance $E(\beta^2)/2$ as $\theta$ is uniform in 
$[0, 2\pi]$. Thus, with the Gaussian assumption, the error probability 
conditioned on $\beta_1$ is given by 


$$P(e \mid \beta_1) = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\gamma}\right). \qquad (23)$$

Note that in eq. (23) γ is equal to half the ratio of signal power to
noise-plus-interference power; hence


$$\gamma = \frac{\beta_1^2 E_b}{(LK)\,\epsilon^2 E_b\, E(\beta^2) + N_0} \qquad (24)$$

with ε² = 2/(3N) given by eq. (22) and E_b = A²T/2. The average value
of γ is

$$\gamma_0 = \frac{\bar{E}_b}{\eta\,\epsilon^2 \bar{E}_b + N_0}, \qquad (25)$$

where Ē_b = E(β²)E_b and η = LK. For N₀ = 0 we have γ₀ = 3N/(2η),
which is a result to be used in what follows.

When β₁² in eq. (24) has the pdf in eq. (14) with E(β²) = ρ_n = ρ₀,
Proakis in his textbook shows that the average of eq. (23) with
respect to β₁ is

$$P(e) = p(\gamma_0) = \frac{1}{2}\left[1 - \sqrt{\frac{\gamma_0}{1 + \gamma_0}}\,\right]. \qquad (26)$$

When selection diversity is used, the pdf for β₁² has the form in eq.
(18), which is just a summation of exponential pdf’s. Accordingly,
P(e) for this case is just a sum of the terms in eq. (26), viz,

$$P(e) = M \sum_{k=0}^{M-1} \binom{M-1}{k} \frac{(-1)^k}{k+1}\; p\!\left(\frac{\gamma_0}{k+1}\right), \qquad (27)$$

a result given by Sundberg.¹⁷

A more complicated, but higher performance, form of diversity is 
Maximal Ratio Combining (MRC). Here the gain and phase of each 
signal term must be known. These gains and phases are then used to 
coherently combine individual path terms to form a single decision 
variable for the data detection process. The decision statistic, assuming 
perfect coherence, is just the summation of β_j·ξ_j, j = 1, 2, ..., M,
where ξ_j is given in eq. (10). The error performance of this form of
diversity can also be given in terms of γ₀ in eq. (25). If the interference
terms are assumed to be uncorrelated from one diversity branch to
another, then the result is given in eq. (7.4.15) of Proakis’ text, viz,


$$P_e = [p(\gamma_0)]^{M} \sum_{k=0}^{M-1} \binom{M-1+k}{k} [1 - p(\gamma_0)]^{k}, \qquad (28)$$


where p(γ₀) is given in eq. (26).
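
The closed forms in eqs. (26) through (28) are straightforward to evaluate numerically. The following is a minimal sketch, not the authors' program: the function names are illustrative, and the formulas are coded as reconstructed here (Rayleigh fading, Gaussian interference, γ₀ taken from eq. (25) with N₀ = 0).

```python
from math import comb, sqrt


def noise_floor_gamma0(n_seq, eta):
    """gamma_0 = 3N / (2*eta): eq. (25) with N0 = 0 and eps^2 = 2/(3N)."""
    return 3.0 * n_seq / (2.0 * eta)


def p_single(gamma0):
    """Eq. (26): average of 0.5*erfc(sqrt(gamma)) over Rayleigh fading."""
    return 0.5 * (1.0 - sqrt(gamma0 / (1.0 + gamma0)))


def p_selection(gamma0, m):
    """Eq. (27): selection diversity of order m (alternating sum of eq. (26) terms)."""
    return m * sum(
        comb(m - 1, k) * (-1) ** k / (k + 1) * p_single(gamma0 / (k + 1))
        for k in range(m)
    )


def p_mrc(gamma0, m):
    """Eq. (28): maximal ratio combining of order m."""
    p = p_single(gamma0)
    return p ** m * sum(comb(m - 1 + k, k) * (1.0 - p) ** k for k in range(m))


if __name__ == "__main__":
    g0 = noise_floor_gamma0(255, 180)  # N = 255, eta = KL = 180
    for m in (1, 2, 4):
        print(m, p_selection(g0, m), p_mrc(g0, m))
```

For M = 1 both diversity expressions collapse to eq. (26), which is a quick sanity check on the reconstruction. The alternating sum in eq. (27) loses precision for large M, so this sketch is only intended for the small orders of diversity discussed here.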

We have not yet determined the error performance for MRC using 
the method of moments. Therefore, for this form of diversity we will 
have to rely on the Gaussian assumption. We will estimate the accu- 
racy of the Gaussian assumption for the case of selection diversity and 
then apply this to the MRC case. 


V. CHANNEL CODING 


We shall be interested in the performance of two simple block 
channel codes used in conjunction with selection diversity. These are 
the (7, 4) Hamming code and the (15, 7) BCH code. The former 
corrects one channel error while the latter corrects two channel errors 
in a coded block. Such codes are discussed in the introductory
textbooks by Pless¹⁸ and Lin and Costello.¹⁹

For a channel code that corrects t errors, the bit error probability
is given in eq. (25) of Milstein et al.⁴ as

$$P_{bt} = \frac{1}{n} \sum_{j=t+1}^{n} j \binom{n}{j} p_e^{\,j} (1 - p_e)^{n-j}, \qquad (29)$$

where for simplicity we denote the channel error probability in eq.
(21), P(e), by p_e, and n is the coded block length. This is an
approximation, and the assumption is made that channel errors are
independent.
We have done some calculations with eq. (29) for the (7, 4) Hamming
code and the (15, 7) BCH code. However, the formulas given below
are more precise, as they are based on the weight distribution of the
codes used in our study. The two approaches never gave a difference
in bit error rate of more than 66 percent. For the (7, 4) Hamming code
we show in Appendix A that


$$P_{b1} = 9p_e^2(1 - p_e)^5 + 19p_e^3(1 - p_e)^4 \qquad (30)$$


for small p_e. As this is a perfect code the result in eq. (30) is exact for
small p_e. The result in eq. (29) gives 6p_e², not 9p_e², for the first term in
eq. (30) when p_e is small.

For the (15, 7) BCH code we show in Appendix A that for small p_e,


$$P_{b2} = 150p_e^3(1 - p_e)^{12} + 512p_e^4(1 - p_e)^{11}. \qquad (31)$$


Some approximations are involved here, as the code is not perfect.
However, the weight distribution of the code is considered. The
formula in eq. (29) gives P_b2 = 91p_e³ for small p_e.
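
The difference between the generic expression of eq. (29) and the weight-distribution formulas of eqs. (30) and (31) can be checked directly. The sketch below is illustrative only; the function names are assumptions, and the formulas are those of eqs. (29) through (31) as reconstructed in this section.

```python
from math import comb


def pb_generic(pe, n, t):
    """Eq. (29): (1/n) * sum_{j=t+1}^{n} j * C(n, j) * pe^j * (1 - pe)^(n - j)."""
    return sum(
        j * comb(n, j) * pe ** j * (1.0 - pe) ** (n - j)
        for j in range(t + 1, n + 1)
    ) / n


def pb_hamming_7_4(pe):
    """Eq. (30): decoded bit error probability of the (7, 4) Hamming code."""
    return 9 * pe ** 2 * (1 - pe) ** 5 + 19 * pe ** 3 * (1 - pe) ** 4


def pb_bch_15_7(pe):
    """Eq. (31): decoded bit error probability of the (15, 7) BCH code."""
    return 150 * pe ** 3 * (1 - pe) ** 12 + 512 * pe ** 4 * (1 - pe) ** 11


if __name__ == "__main__":
    pe = 1e-4
    # Leading-term ratios approach 9/6 and 150/91 as pe becomes small.
    print(pb_hamming_7_4(pe) / pb_generic(pe, 7, 1))
    print(pb_bch_15_7(pe) / pb_generic(pe, 15, 2))
```

The ratio 150/91 is about 1.65, which is in line with the statement that the two approaches differ by no more than about 66 percent.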


VI. NUMERICAL RESULTS 


The computer programs developed by Kavehrad¹ for the method of
moments were modified to incorporate the moments of interference 
terms in the new form in eq. (10a). The new program was simply 
adapted to perform the computations needed to evaluate eqs. (20) and 
(21). Our computations will be for the Gold sequences of length 127 
and the Kasami sequences of length 255. Initial loadings to generate 
these codes were taken from Ref. 20. 

Before discussing our numerical results let us just review the main 
parameters of our model. They are 


N = spread-spectrum sequence length 
L = number of multipath links 

M = number of terms used for diversity 
K = number of simultaneous users 


and 
γ₀ = average signal-to-noise ratio.


In our computations we are interested in the case in which L is small, 
M is moderate, and K is large. For the most part, we will concentrate 
on the so-called® noise floor average error probability. This is the error 
probability when the thermal noise is absent; that is, when No = 0. 
Our computations were most easily done for 1 ≤ K ≤ 15 and
1 ≤ L ≤ 30.

We will examine the three hypotheses listed below. The first one
follows.


6.1 Error probability is approximately a function of only η = KL


Our hypothesis is that the error probability is approximately a
function of only η = KL and not of K and L themselves. This is
certainly true in the case when a Gaussian assumption is made, as can
be seen by an examination of eqs. (25), (26), (27), and (28) of Section
IV. The goal of our system analysis is to estimate performance for,
say, K = 90 and L = 2. We wish to do this by computing the error
probability for η = KL, with, for instance, K = 15 and L = 12, which
also gives η = 180.

The results of our computations for N = 255, the Kasami code, are
shown in Table I. For error probabilities in the channel-coded cases,
that is, P_b1 and P_b2, of around 10⁻⁴ we see that, for the most part, we
are in error at most by a factor of 2 or 3. This is an acceptable error
factor for practical error probabilities. An error factor of 6 to 10 would
not be acceptable in our view. This point takes us to our next
hypothesis.


6.2 Coding plus selection diversity is power efficient 


To evaluate the error performance when channel coding is used we 
take the result of the computation represented by eq. (21) and substi- 
tute it into (30) or (31). Equation (30) is for single-error correction 
and eq. (31) is for double-error correction. 

We will compare channel coding plus selection diversity of order M
versus selection diversity alone of order 2M. This will be done for the
single-error-correcting (7, 4) Hamming code. Thus P_b1 is given in eq.
(30). For large γ₀ it is easily shown that P_e in eq. (27), for selection
diversity and a Gaussian assumption, is given by cγ₀^(−M), where c is a
constant. For various constants, c, see Table 1 of Ref. 17. Thus, using
the (7, 4) Hamming code plus diversity takes M → 2M, since the
channel error probability is squared to give the decoded error
probability and thus doubles the order of diversity. One more point is
important in our comparison. Diversity and channel coding are similar
in nature, since both try to exploit the redundancy in the transmitted
signal. Our codes require bandwidth expansion to achieve this
redundancy, but space diversity does not. Now the number of discrete
multipath links between transmitter and receiver is given by eq. (3).
Since the signal bandwidth, B_w, is smaller for selection diversity alone,
we evaluate its performance when L is replaced by L − 1 for L = 2 or
3, where the channel-coded system is taken to produce L multipath
terms. Of course, for larger L’s, the former should be replaced by
L − 2, and so on. In any case, the uncoded system is subject to less
interference than the coded system. The results of our computations
are shown in Fig. 5. Also shown in Fig. 5 are two isolated points for
coding alone, that is, M = 1. One notes that, except for low M, the
combination of channel coding plus selection diversity is significantly
better than using selection diversity alone.


Table I—The data to test the P_e = f(η), η = LK, hypothesis for
N = 255

M    η      K     L     P_e         P_b1        P_b2

4    60     10    6     0.41E-02    0.15E-03    0.98E-05
4    60     6     10    0.50E-02    0.22E-03    0.18E-04
6    90     15    6     0.85E-03    0.65E-05    0.91E-07
6    90     6     15    0.19E-02    0.32E-04    0.10E-05
6    150    15    10    0.41E-02    0.15E-03    0.98E-05
6    150    10    15    0.50E-02    0.22E-03    0.18E-04
6    180    15    12    0.69E-02    0.41E-03    0.45E-04
6    180    12    15    0.91E-02    0.57E-03    0.72E-04
8    180    15    12    0.35E-02    0.11E-03    0.62E-05
8    180    10    18    0.44E-02    0.17E-03    0.12E-04
8    240    15    16    0.88E-02    0.67E-03    0.91E-04
8    240    12    20    0.97E-02    0.80E-03    0.12E-03


Fig. 5—Comparison of selection diversity plus coding versus selection diversity alone.
[(7, 4) Hamming code, K = 15, N₀ = 0, N = 255. Vertical axis: noise floor
bit error probability. Horizontal axis: M, the order of diversity in the
coded case. Curves: selection diversity plus coding; only selection
diversity of order 2M.]

We note that selection diversity alone can saturate to a poor value
of error probability as M gets large. From Jakes³ it is well known that
the signal-to-noise ratio (s/n) performance of selection diversity
becomes poor relative to MRC as M grows. In the case of multipath
diversity, performance becomes even worse. To see this point let us
first consider MRC. Jakes³ shows that, for antenna diversity of order
M, the average s/n is Mγ₀ (see page 3.19 of Ref. 3), where γ₀ is the
average s/n for no diversity. Let the only source of noise for multipath
diversity be the self-noise term in eq. (10). Accordingly, for multipath
diversity alone γ₀ must be changed to γ₀/M and the average s/n with
MRC diversity is γ₀ for all M. Thus, multipath diversity allows for no
improvement in average s/n. In general, let there be L_s antennas and
L_M multipath diversity terms giving M = L_s·L_M. In the sequel we
always take L_M = L, the number of paths in our multipath model.
Then the average s/n is L_sγ₀ and the improvement is only due to L_s.
With selection diversity Jakes³ shows that the average s/n is

$$E(\gamma) = \gamma_0 \sum_{k=1}^{M} \frac{1}{k}.$$

For M = 2 we have E(γ) = 0.75γ₀ for multipath diversity alone,
since we must replace γ₀ by γ₀/2 due to self-interference. For M = 4
with L_s = L_M = 2 we have E(γ) = 2γ₀ for MRC and we have E(γ) =
25γ₀/24 for selection diversity, a loss of about 3 dB to MRC.
Fortunately, the system error probability is not a function of E(γ)
alone, as it is a polynomial in γ₀⁻¹. In any case, for acceptable
performance we will find that selection combining requires both
antenna and multipath diversity. However, this may not be necessary
for MRC or equal gain combining; only multipath diversity may suffice.
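
The average s/n figures quoted in this comparison follow from the harmonic sum for selection diversity and the linear gain for MRC, with the per-branch s/n reduced to γ₀/L_M to account for self-interference. A minimal sketch, with illustrative function names and γ₀ normalized to 1:

```python
from math import log10


def avg_snr_selection(gamma_branch, m):
    """Average s/n of selection diversity: gamma_branch * sum_{k=1}^{m} 1/k."""
    return gamma_branch * sum(1.0 / k for k in range(1, m + 1))


def avg_snr_mrc(gamma_branch, m):
    """Average s/n of maximal ratio combining: m * gamma_branch."""
    return m * gamma_branch


def avg_snr(gamma0, l_s, l_m, combiner):
    """Average s/n for l_s antennas and l_m multipath terms (M = l_s * l_m).

    Self-interference among the l_m paths lowers the per-branch s/n
    from gamma0 to gamma0 / l_m, as argued in the text.
    """
    gamma_branch = gamma0 / l_m
    m = l_s * l_m
    if combiner == "mrc":
        return avg_snr_mrc(gamma_branch, m)
    if combiner == "selection":
        return avg_snr_selection(gamma_branch, m)
    raise ValueError("combiner must be 'mrc' or 'selection'")


if __name__ == "__main__":
    mrc = avg_snr(1.0, 2, 2, "mrc")        # 2 * gamma0
    sel = avg_snr(1.0, 2, 2, "selection")  # 25/24 * gamma0
    print(mrc, sel, 10 * log10(mrc / sel))  # loss of selection relative to MRC, in dB
```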


6.3 Coding plus selection diversity is power and bandwidth efficient


Our next comparison involves the performance of coding plus selec- 
tion diversity versus selection diversity alone for the same system 
bandwidth. We did this by finding the selection diversity performance 
of the 127-length code (the Gold codes) with the (7, 4) Hamming code 
versus the length 255 code (the Kasami codes) with no channel coding. 
The result is plotted in Fig. 6. As Milstein et al.⁴ found earlier, simple
error-correcting codes are an effective way to improve the performance 
of spread-spectrum systems for the same system bandwidth. 


6.4 Performance: coding plus selection diversity 


The results of our computations using the method of moments are
given in Figs. 7, 8, and 9. Figures 7 and 8 depict the noise floor error
probability, that is, for N₀ = 0. Figure 7 is for small L and Fig. 8 is for
large L. These computations are for independent values of K, L, and
M; M is not a function of L, as will be the case in Section VII. We are
interested in large L and P_e = 10⁻⁴ in our system analysis, which will
be discussed in the next section, where extensive use will be made of
the results in Figs. 7 and 8. Figure 9 presents the performance with
N₀ ≠ 0, which will not be used in the sequel, since our IWC systems
analysis is completely based on the noise floor error probability.


Fig. 6—Comparison of coding versus no coding in the same spread-spectrum
bandwidth.


6.5 Computations: Gaussian assumption 


It is clear that computation of the system error probability when a
Gaussian assumption is invoked is quite simple. This is true for either
selection or MRC diversity, as can be observed by referring to eqs.
(27) and (28), respectively. Since the method of moments precisely
computes the system error probability, we can assess the goodness of
the Gaussian approximation.


Fig. 7—Noise floor error probability for the (7, 4) Hamming code and various orders
of diversity. [(7, 4) Hamming code, N = 255, K = 15, N₀ = 0, η = KL.
Vertical axis: noise floor bit error probability. Horizontal axis: L, the
number of paths.]

Our calculations are plotted in Figs. 10a and b. The Gaussian
approximation underestimates the noise floor error probability. This
gets worse as the number of interferers increases, a result that runs
counter to intuition based on the central limit theorem. Around
a 10⁻⁴ error probability, however, the discrepancy is acceptable.
Because the Gaussian assumption underestimates the error probability,
it will, for a fixed error probability, lead to an overestimation in the
number of simultaneous users of a spread-spectrum multiple access
system. We present the degree of this overestimation in Table II. In
this table both the Hamming and BCH codes are considered for two
cases of diversity, M = 4 and 6. In both cases we are interested in that
value of η = KL that gives a noise floor error probability of
approximately 10⁻⁴. To do this we consider the Gold code of length 127
and the Kasami code of length 255. In the case of a Gaussian assumption
the system performance depends only on γ₀ = 3N/(2η). Thus, if N is
doubled, so should η for the same value of γ₀. The percent error in
this assumption is shown in Table II. For the orders of diversity of
interest, the error is at most 20 percent.

We have also done computations for MRC by invoking the Gaussian
assumption. This procedure uses eq. (28) and just depends on the
parameter γ₀ = 3N/(2η), η = KL. We then reduce the value of η
produced by this computation by 20 percent for, say, P_e = 10⁻⁴, as this
was the error in the case for selection diversity. The results of a limited
number of such computations will be discussed in the next section.


Fig. 8—Noise floor error probability for the (7, 4) Hamming code and (15, 7) BCH
code for various orders of diversity.


Fig. 9—Bit error probability as a function of E_b/N₀ for M = 4 and M = 6.
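
The percent errors reported in Table II can be reproduced from the η values themselves: under the Gaussian assumption η doubles when N doubles, and the error is measured against the method-of-moments value at N = 255. The sketch below is illustrative; the function names are assumptions, and the input numbers are the Table II entries.

```python
def gaussian_eta_for_doubled_length(eta_short):
    """Gaussian assumption: gamma_0 = 3N/(2*eta) is unchanged if N and eta both double."""
    return 2 * eta_short


def percent_error(eta_gaussian, eta_moments):
    """Overestimate of eta by the Gaussian assumption, relative to the method of moments."""
    return 100.0 * (eta_gaussian - eta_moments) / eta_moments


if __name__ == "__main__":
    # (M, code, eta at N = 127, eta at N = 255), both from the method of moments.
    rows = [(4, "(7, 4)", 60, 105), (4, "(15, 7)", 90, 150),
            (6, "(7, 4)", 80, 150), (6, "(15, 7)", 120, 210)]
    for m, code, eta_127, eta_255 in rows:
        eta_g = gaussian_eta_for_doubled_length(eta_127)
        print(m, code, eta_g, round(percent_error(eta_g, eta_255), 1))
```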


VII. IWC PARAMETER STUDY


7.1 Tm = 100 nanoseconds 


Measurements by Saleh and Valenzuela²¹ have established the
multipath delay spread in the Crawford Hill building at AT&T Bell
Laboratories, Holmdel, New Jersey. The measurements indicate that
the maximum delay spread is usually T_m = 100 ns. The distance over
which these measurements were taken was approximately 300 ft.

The application that interests us is for 32-kb/s digital speech. We
take this as the source rate. The service supplied also will include a
9.6-kb/s data service, and such a source would be channel coded up to
the 32-kb/s rate to provide the extra error protection needed for data.
We focus on a threshold average error rate of 10⁻⁴ for speech. For
1000 bit packets this translates into approximately a 10-percent packet
error probability. Valenzuela²² and Wong et al.²³ have developed
interpolation schemes to handle such packet loss rates. In any case, for
a 32-kb/s source rate, the bandwidths of the various spread-spectrum
systems that we will consider are given in Table IIIa. The IWC
application here presumes overlay signaling,²⁴ where spread-spectrum
users’ signals coexist with those of users of other services in a lightly
loaded part of the radio frequency band.


Fig. 10—The Gaussian assumption for (a) low orders of diversity and (b) high orders
of diversity.


Table II—Error in the Gaussian assumption when the
sequence length is doubled. The calculations of η are for
P_e = 10⁻⁴ and η = KL.

M    Code       η (N = 127)    η (N = 255)    Gaussian η (N = 255)    Percent error

4    (7, 4)     60             105            120                     14.3
4    (15, 7)    90             150            180                     20.0
6    (7, 4)     80             150            160                     6.7
6    (15, 7)    120            210            240                     14.3


Table III—Transmission parameters for
different sequence lengths

N      No Coding    (7, 4) Code    (15, 7) Code

(a) Bandwidth in MHz of various spread-spectrum sequence
lengths for a source rate of 32 kb/s

127    4.06         7.11           8.70
255    8.16         14.20          17.50
512    16.40        28.67          35.11

(b) Number of discrete multipath components when
T_m = 100 ns

127    1            1              1
255    1            2              2
512    2            3              4

(c) Number of discrete multipath components when
T_m = 250 ns

127    2            2              3
255    3            4              5
512    5            8              9
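
The entries of Table III can be regenerated from the source rate, the sequence length, and the code rate. The sketch below rests on two assumptions that are not stated in this section: that the bandwidth is the chip rate, (source rate) × N / (code rate), and that eq. (3), which is not reproduced here, amounts to L = ⌊T_m × bandwidth⌋ + 1. Both were chosen only because they reproduce the tabulated values (a few bandwidth entries differ from the table in the last digit); all names are illustrative.

```python
from math import floor

SOURCE_RATE_HZ = 32e3  # 32-kb/s digital speech
CODE_RATES = {"none": 1.0, "(7,4)": 4 / 7, "(15,7)": 7 / 15}


def bandwidth_hz(n_seq, code):
    """Assumed spread-spectrum bandwidth: chip rate after channel coding."""
    return SOURCE_RATE_HZ * n_seq / CODE_RATES[code]


def discrete_paths(n_seq, code, t_m):
    """Assumed form of eq. (3): L = floor(T_m * bandwidth) + 1."""
    return floor(t_m * bandwidth_hz(n_seq, code)) + 1


if __name__ == "__main__":
    for n_seq in (127, 255, 512):
        mhz = [round(bandwidth_hz(n_seq, c) / 1e6, 2) for c in CODE_RATES]
        l_100 = [discrete_paths(n_seq, c, 100e-9) for c in CODE_RATES]
        l_250 = [discrete_paths(n_seq, c, 250e-9) for c in CODE_RATES]
        print(n_seq, mhz, l_100, l_250)
```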

A crucial parameter in our study will be the number of discrete
paths, L, for a given maximum multipath delay spread and
spread-spectrum bandwidth as predicted by eq. (3). Thus, for N = 255 in
Table III and for T_m = 100 ns, we let L = 2. Other values of L are
given in Table IIIb. From Fig. 8 we note that for the (15, 7) BCH code
(double-error-correcting), we can have L = 14 for K = 15 in order that
P_e = 10⁻⁴ when M = 6. As K = 15 we have η = LK = 210. We now
invoke our assumption that the system error rate is, to a good
practical approximation, just a function of η = KL. Thus, if L = 2
we can get K = 105 simultaneous users since η = 210. The order
of diversity is M = L_s·L, where L_s is the number of antennas and L
the discrete order of spread-spectrum diversity. Therefore, L_s = 3
antennas at the central station are needed to support 105 simultaneous
users.

Let us see now how many simultaneous users the single-error-
correction system can support. For M = 6 from Fig. 8 we get L = 10
and thus η = 150. As L = 2, our assumption P_e = f(KL) gives K = 75
active users. This is for L_s = 3 antennas.
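
The scaling argument used in these two paragraphs is simple arithmetic: read the admissible η = KL off the computed curves for K = 15, then redistribute it over the actual number of paths. A minimal sketch with illustrative function names follows; the η values are the ones quoted in the text.

```python
def users_supported(eta, paths):
    """K = eta / L under the hypothesis that P_e depends only on eta = K * L."""
    return eta // paths


def antennas_needed(diversity_order, paths):
    """L_s such that M = L_s * L reaches the required diversity order (ceiling division)."""
    return -(-diversity_order // paths)


if __name__ == "__main__":
    # (15, 7) BCH code, M = 6: L = 14 at K = 15 gives eta = 210.
    print(users_supported(14 * 15, 2), antennas_needed(6, 2))
    # (7, 4) Hamming code, M = 6: L = 10 at K = 15 gives eta = 150.
    print(users_supported(10 * 15, 2), antennas_needed(6, 2))
```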

With the sort of computation we have just outlined, through use
of Fig. 8, and other related calculations, we can construct Table IV.
The bandwidth efficiency measure given in Table IV will be discussed
below. Also given are some performance estimates for the MRC form
of diversity with no channel coding. When the spread-spectrum
sequence length, N, is, say, 255, we have set N = 512 for MRC in order
that the MRC system and the selection diversity with channel coding
system occupy the same bandwidth. The estimates for MRC were
computed using the Gaussian assumption, γ₀ = 3N/(2η), and eqs. (26)
and (28). Such estimates of K were then reduced by 20 percent in
accordance with our earlier results on the Gaussian approximation.
We have made this reduction in computing the data of Table IV.
Subject to such estimates we note that M = 6 is needed for MRC to
outperform the combination of selection diversity and channel coding.


Table IV—The number of simultaneous users in terms of sequence
length, N, and L_s. The number of users is given in columns 3, 4, and
5. The order of diversity is L_s·L, where L is given in Table IIIb. We
have set T_m = 100 ns, P_e = 10⁻⁴.

                                               Bandwidth       Bandwidth
                                               Efficiency      Efficiency
N      L_s    (7, 4)    (15, 7)    MRC at      (15, 7)         MRC at
              Code      Code       2N          Code            2N

127    2      20        40         6           0.14            0.02
255    1      23        38         6           0.07            0.01
255    2      50        75         60          0.12            0.12
255    3      75        105        132         0.19            0.25
512    1      42        60         -           0.05            -
512    2      80        108        -           0.10            -

We note that the estimate for K for an N = 512 sequence length
was determined as follows. The value of η = KL for N = 255 was
doubled and the result was reduced by 20 percent. This is in keeping
with our findings regarding the Gaussian assumption (see Table II).

In Tables Va and b we present results when L is random. We let L
vary from unity up to the maximum value, L_max, given by the
right-hand side of eq. (3). Each value of L is taken to occur with a
probability of 1/L_max. We display the average value of K and also the
value of K corresponding to L = L_max in Tables Va and b. Actually, the
value of K for each value of L, 1 ≤ L ≤ L_max, differed only slightly
from the average value of K. We find this invariance because as L
decreases, the order of diversity, M = L_sL, decreases, but so does the
maximum number of interference terms, η = KL, thus giving rise to
approximately a fixed P_e. Note that we cannot have only one antenna
in the random model, since when L = 1 all diversity is lost.

In Fig. 11 we have plotted the bandwidth efficiency of some of our
schemes. This is given by

$$\mathrm{BE} = \frac{K \cdot (\text{Code Rate})}{N},$$

where N is the spread-spectrum code length and K is the number of
simultaneous users as estimated using the procedure just described.
We note that the double-error-correcting system is 11 percent
more bandwidth efficient than the single-error-correcting system.
Tabular data on bandwidth efficiency are given in Table IV. Although
the efficiency values in Fig. 11 are rather low, we remember that in
overlay signaling the values represent the order of frequency reuse.


Table V—A comparison of the estimate of the
number of simultaneous users when L = L_max, as
given by eq. (3), and when L = 1, 2, ..., L_max,
each with probability 1/L_max. Average K is over
such L.

N      L_s    L_max    K for L_max    Average K for 1 ≤ L ≤ L_max

(a) Data are for the (7, 4) code with T_m = 100 ns

127    2      1        20             20
255    2      2        50             50
255    3      2        75             75
512    2      3        80             79

(b) A comparison of the estimate of the number of simultaneous
users for the (15, 7) code

127    2      1        40             40
255    2      2        75             75
255    3      2        105            112
512    2      4        108            115
We have also placed other points on our bandwidth efficiency plot
in Fig. 11. These points are for less severe channel models than the
one we consider. If we assume an Additive White Gaussian Noise
(AWGN) channel, the performance follows from eq. (24). In eq. (24)
let N₀ = 0 so that we get the noise floor error probability and assume
that the multiuser interference follows the Gaussian model.
Furthermore, let b = β₁²/E(β²) and η = LK. Combining eqs. (23) and (24)
for the bandwidth efficiency, there follows

$$\mathrm{BE} = \frac{3b}{2L\,[\mathrm{erfc}^{-1}(2P_e)]^{2}}. \qquad (32)$$

In eq. (32) for P_e = 10⁻⁴ we have BE = 0.22b/L. For the AWGN
channel β₁ = β = 1 and L = 1 to give BE = 0.22, which agrees with
Turin’s⁷ result (see Table 1 of Ref. 7; our result is slightly higher, as
we use coherent binary phase-shift keying rather than differential
binary phase-shift-keying modulation). Thus, for the same bandwidth
as used in Table IV, as we have BE = 0.22, N = 512 gives K = 112
with no coding or diversity.

Another case of interest is when the signal term has a deterministic
gain but the multiuser interference is subject to Rayleigh fading (see
Case 1, Section 4.2 of Ref. 1). As stated in Ref. 1, this is the case in
which the reference transmitter is stationary and there is not much
movement in the indoor medium. For L = 2, that is, T_m = 100 ns,
N = 255, and R_b = 32 kb/s, we have BE = 0.11b, where b = β₁²/E(β²).
If b = 2, meaning that the average, faded interference power is 3 dB
less than the average, unfaded, signal power, BE = 0.22 as for the
AWGN channel. Of course, BE grows linearly with b.


Fig. 11—Bandwidth efficiency of error-correction-coded spread-spectrum systems.
For N = 255, selection diversity alone requires L_s = 8 to get an efficiency of 0.11.
[P_e = 10⁻⁴, selection diversity, N = 255, N₀ = 0, T_m = 100 ns, L = 2,
M = L_s·L. Vertical axis: bandwidth efficiency. Horizontal axis: L_s, the
number of antennas. Curves: (15, 7) BCH code; (7, 4) Hamming code;
additive white Gaussian noise channel.]
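
The constant 0.22 in eq. (32) can be verified numerically. The Python standard library has no inverse complementary error function, so the sketch below builds one from `statistics.NormalDist.inv_cdf` using the identity erfc(x) = 2Φ(−√2·x); the helper and function names are illustrative, and the formula is eq. (32) as reconstructed above.

```python
from statistics import NormalDist


def erfc_inv(y):
    """Inverse of erfc on (0, 2), via erfc(x) = 2 * Phi(-sqrt(2) * x)."""
    return -NormalDist().inv_cdf(y / 2.0) / 2 ** 0.5


def be_gaussian(pe, b=1.0, paths=1):
    """Eq. (32): BE = 3b / (2 * L * [erfc^{-1}(2 * P_e)]^2)."""
    return 3.0 * b / (2.0 * paths * erfc_inv(2.0 * pe) ** 2)


if __name__ == "__main__":
    print(be_gaussian(1e-4))                 # AWGN case: b = 1, L = 1
    print(be_gaussian(1e-4, b=1, paths=2))   # L = 2, per unit b
    print(be_gaussian(1e-4, b=2, paths=2))   # b = 2 restores the AWGN value
```

The exact value at P_e = 10⁻⁴ is slightly below 0.22, so user counts derived from the rounded constant can differ by one from those derived from the unrounded expression.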

For the case just treated we can do a more exact analysis, as was
done by Kavehrad in Ref. 1 (see Case 1). Let us assume that the result
of the selection diversity process is deterministic, not random, whereas
all the rejected path gains are Rayleigh faded. The method of moments
can be applied to get the exact solution for the error probability in
this case. The theory is similar to that given by Kavehrad¹ and will
not be repeated here.


Fig. 12—Number of simultaneous users versus spread-spectrum sequence length
when T_m = 100 ns. [Horizontal axis: N, the spread-spectrum sequence
length (127, 255, 512).]

We have summarized the results of all our computations in Fig. 12. 
The results are all for the Rayleigh faded, discrete multipath model. 
Note that for a single antenna, the double-error-correction system will 
support around 40 simultaneous users with a spread-spectrum code 
length of 255. 


7.2 Tm = 250 nanoseconds 


We now consider the larger multipath delay spread reported in Ref.
25, which was characteristic of the Holmdel building at AT&T Bell
Laboratories. Now T_m = 250 ns, a Root Mean Square (RMS) value, a
figure that should be used in a larger building. Now for N = 255 and
R_b = 32 kb/s one gets L = 4 in eq. (3) for the discrete multipath model.
Other values of L for T_m = 250 ns are given in Table IIIc. Note that
N = 512 would give L = 8 or 9 for the coded system, and thus a
diversity order that is too large. In this sense our spread-spectrum
system has an optimum sequence length or bandwidth. Results similar
to those in Table IV are given in Table VIa and are also plotted in
Fig. 13. Note that the (15, 7) coded system can support just under 38
simultaneous users with the same antenna diversity and code length
(that is, L_s = 1 and N = 255) as used when T_m = 100 ns (see Table
IV). However, the multipath diversity is now of order 5 (see Table
IIIc). In any case, the number of simultaneous users is about the same.
In a simple Time-Division Multiple Access (TDMA) system, one-half
the number of users would be lost as the maximum multipath delay
spread has been increased by approximately a factor of two. As such,
an SSMA system is less sensitive to a change in maximum multipath
delay spread than a simple TDMA system would be. However, if more
than 40 simultaneous users are needed, the SSMA system also loses.
For instance, compare the performance of the double-error-correcting
system at L_s = 2 when N = 512 for T_m = 100 ns (Table IV) and when
N = 255 for T_m = 250 ns (Table VIb); the loss is from 108 to 60
simultaneous users.


Table VIa—The number of simultaneous users in terms of sequence
length, N, and the number of base station antennas, L_s. The number
of users is given in columns 3, 4, and 5. The order of diversity is L_s·L,
T_m = 250 ns, P_e = 10⁻⁴.

N      L_s    (7, 4)    (15, 7)    MRC at    BE (15, 7)    BE MRC at
              Code      Code       2N        Code          2N

127    1      10        20         9         0.07          0.04
127    2      30        40         44        0.15          0.17
255    1      26        36         30        0.07          0.06
255    2      48        60         120       0.11          -
512    1      40        50         -         0.05          -


Table VIb—A comparison of the estimate of the number
of simultaneous users with T_m = 250 ns

N      L_s    Code       L_max    K for L_max    Average K

127    2      (7, 4)     2        30             27
255    2      (7, 4)     4        48             49
127    2      (15, 7)    3        40             42
255    2      (15, 7)    5        60             68

As in the case when T_m = 100 ns we include estimates of K for the
random model for L in Table VIb. The trend in the results is essentially
the same as was observed in Tables Va and b.


7.3 Multipath outage performance estimate 
Up to now we have used the average error probability as the
performance measure. Another measure is to find the distribution of
the error probability. This has been used in other radio studies, for
example, in Refs. 26 and 27. In the latter study if the probability that
the average error probability exceeds the value X is, say, 0.10, the
multipath outage is said to be 10 percent. In keeping with our earlier
work we take X = 10⁻⁴.


Fig. 13—Number of simultaneous users versus spread-spectrum sequence length
when T_m = 250 ns. [T_m = 250 ns, source rate 32 kb/s, 10⁻⁴ ≤ P_e ≤ 2 × 10⁻⁴.
Vertical axis: K, the number of simultaneous users. Horizontal axis: N,
the spread-spectrum code length (127, 255, 512). Curves, for one and
two antennas: BCH code plus selection diversity; (7, 4) Hamming code
plus selection diversity; maximal ratio combining, no coding.]


1956 TECHNICAL JOURNAL, OCTOBER 1985

We will determine the multipath outage for the reference user path
gain, β₁, with all other path gains having the Rayleigh statistics as
assumed earlier. The computation of the multipath outage follows
closely Case 1 of Ref. 1. First β₁ is taken to be fixed, and the error
probability is found by averaging the right-hand side of eq. (11) with
respect to the sum of the self- plus multiuser interference, using the
method of moments. Let us call the result of this calculation P_e(β₁),
since it depends on the path gain β₁. We then vary β₁ from zero to β₀,
where P_e(β₀) = X, the bit error rate parameter. Then


$$P\{P_e(\beta_1) \ge X\} = P\{0 \le \beta_1^2 \le \beta_0^2\}.$$


The probability P(0 ≤ β₁² ≤ β₀²) is easily obtained by integrating the
pdf for selection diversity as given in eq. (17).
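
The final integration step can be sketched as follows. The selection-diversity pdf referred to in the text is not reproduced in this section, so the sketch assumes the standard result for the largest of M independent, identically distributed Rayleigh path gains: each β² is exponential with mean E(β²), and the distribution function of the maximum is (1 − exp(−x/E(β²)))^M. The function name and arguments are illustrative, and the threshold β₀² is taken as given (in the text it comes from the method of moments).

```python
from math import exp


def outage_probability(beta0_sq, mean_beta_sq, m):
    """P{0 <= beta_1^2 <= beta_0^2} for selection among m i.i.d. Rayleigh branches.

    Assumes the selected beta_1^2 is the maximum of m exponential random
    variables with mean mean_beta_sq (an assumption, not taken from the paper).
    """
    return (1.0 - exp(-beta0_sq / mean_beta_sq)) ** m


if __name__ == "__main__":
    # Hypothetical threshold equal to 20 percent of the mean path power.
    for m in (1, 2, 4, 6):
        print(m, outage_probability(0.2, 1.0, m))
```

Under this assumption the outage falls geometrically with the order of diversity for a fixed threshold, which is the qualitative behavior expected when antennas are added.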

If coding is involved, P,(8,) changes. For instance, for X < 107°, 
the decoded error probability is well approximated by 9P_e²(β₁), say,
for the (7, 4) Hamming code. To find β₀ one solves the equation
X = 9P_e²(β₀) for a given X. Of course, for the same X, β₀ for the coded
case is smaller than β₀ for diversity alone, which leads to a lower
outage probability.

The results of our computations in the manner just described are
shown in Fig. 14. Note that coding is effective in reducing the outage
probability for increasing η = KL. We found that, to a good
approximation, the outage probability was only a function of the
product, KL. This allows us to estimate the number of users for a fixed
multipath outage as we did earlier for the average error probability.
The results are shown in Table VIIa for a 10-percent multipath outage
and T_m = 100 ns. Note that the results are given for E_b/N₀ = 25 dB.
We found that for up to 15 moments, outage probabilities for larger
values of E_b/N₀ were not reliable in these computations. No such
problem occurred in computing the average error probability. The
number of simultaneous users is about 10 percent less than it would be
if the average error probability for E_b/N₀ = 25 dB were used as a
performance measure.


Fig. 14—Distribution of error probability.


Table VIIa—The number of simultaneous users
for a multipath outage probability of 10 percent
when T_m = 100 ns, E_b/N₀ = 25 dB,
2ρ₀ = −14 dB, and the error probability
parameter is X = 10⁻⁴

              K         K
N      L_s    (7, 4)    (15, 7)    MRC at
              Code      Code       2N

127    2      5         15         0
127    3      20        36         24
255    1      5         15         0
255    2      38        53         53
255    3      45        68         112


Table VIIb—The reduction in outage
probability as L_s, the number of antennas, is
increased. This reduction is relative to L_s = 2,
where K = 38 for the (7, 4) code and K = 53
for both the (15, 7) code and the MRC
system.

N      L_s    System    Outage probability

255    3      Coded     3.5 × 10⁻²
255    4      Coded     1.1 × 10⁻²
512    3      MRC       [illegible]
512    4      MRC       4.7 × 10⁻⁴


We note that the Gaussian assumption had the same accuracy as we 
stated earlier for the average error probability. Accordingly, the MRC 
results were computed by invoking a Gaussian assumption on the 
interference. Table VIIb shows results on how the multipath outage 
can be reduced by increasing antenna diversity. For a large order of 
diversity, MRC is stronger than selection diversity plus coding in this 
task. 

Our work to this point has assumed average power control, so that all 
user signals arrive at the central station with the same average power. 
We now let the static attenuation of the reference user be higher than 
that of every other member of the multiuser population. The results for 
a 10-percent multipath outage are shown in Table VIII. Note that the 
loss in user population, in percent, is about the same as the static 
power loss of the reference user. Thus, SSMA is not sensitive to small 
deviations in static power control. 


VII. CONCLUSION 


Our main conclusion is that direct-sequence, spread-spectrum mod- 
ulation can give a quite respectable number of simultaneous users for 
communication over fading, multipath channels. This is for commu- 
nication from transportable stations to a base station in a star network 
using average power control that depends only on the power law 
exponent and static shadow fading. The inherent diversity of spread- 
spectrum modulation can be combined with antenna diversity using 
the simple selection diversity rule to give efficiencies that are compa- 
rable to those attained with maximal ratio combining. Fairly high 
orders of diversity are needed for the latter to be better than selection 
diversity used with channel coding. 

Table VIII—The percentage reduction in the number of simultaneous 
users in Table VII as the reference user is subjected to increased 
attenuation. This is for T_m = 100 ns, X = 10⁻⁴, and the multipath 
outage probability of 10 percent. 

  δγ_0       δγ_0       δK (7, 4) Code    δK (15, 7) Code 
  (dB)     (Percent)      (Percent)          (Percent) 
  −0.5        12             21                 15 
  −1.0        26             36                 28 
  −2.0        58             58                 47 

We also have found that a Gaussian assumption regarding the 
multiuser interference can lead to a maximum error of 20 percent in 
predicting the number of simultaneous users for a noise floor average 
error probability of 10⁻⁴. This is for a system using selection diversity. 

We conclude that spread-spectrum modulation can be less sensitive 
to a change in maximum multipath delay spread than, say, time- 
division multiple access would be. Thus the same spread-spectrum 
modem could possibly be used in either large or small buildings for 
indoor radio communication. 

The main assumptions used in the paper are that demodulation 
errors are independent and that the multipath model is discrete. The 
former can be overcome with interleaving. However, the latter must 
be verified experimentally for indoor, wireless, local area network 
application. 

We have also considered multipath outage as a performance crite- 
rion. For an outage of 10 percent we find that the number of simulta- 
neous users is reduced by approximately 10 percent over that predicted 
by using average error probability as a performance measure. 

Although our analysis was for selection diversity or maximal ratio 
combining and coherent binary phase-shift keying, in practice we 
would suggest equal gain combining and differential binary phase- 
shift-keying modulation. We make this suggestion because the 
performance of equal gain combining usually falls between that of 
selection diversity and that of maximal ratio combining. Also, 
differential phase-shift keying is less troublesome to demodulate in a 
fading environment such as occurs in IWC. 


VIII. ACKNOWLEDGMENT 


Discussions with David Goodman and L. J. Greenstein have been 
helpful in our work. Also, Carl-Erik Sundberg is acknowledged for 
contributing some of the ideas used in Appendix A. 


REFERENCES 


1. M. Kavehrad, “Performance of Nondiversity Receivers for Spread Spectrum in 
Indoor Wireless Communications,” AT&T Tech. J., 64, No. 6, Part 1 (July- 
August 1985), pp. 1181-210. 

2. G. H. Golub and J. H. Welsh, “Calculations of Gauss Quadrature Rules,” Math. 
Comput. J., 26 (April 1969), pp. 221-30. 

3. Wm. C. Jakes, Jr., Microwave Mobile Communications, New York: Wiley, 1974. 

4. L. B. Milstein, R. L. Pickholtz, and D. L. Schilling, “Optimization of the Processing 
Gain of an FSK-FH System,” IEEE Trans. Commun., COM-28 (July 1980), pp. 
1062-79. 

5. M. K. Simon et al., Spread Spectrum Communications, 1, Rockville, Maryland: 
Computer Science Press, 1984. 

6. A. Livine, “Design Considerations for Code Division Multiple Access in Voice/Data 
Radio Network,” Proc. 1984 Military Comm. Conf., October 21-24, Los Angeles, 
California, pp. 37.2.1-.6. 

7. G. Turin, “The Effects of Multipath and Fading on the Performance of Direct- 
Sequence CDMA Systems,” IEEE J. Selected Topics Commun., SAC-2 (August 
1984), pp. 597-603. 

8. P. Freret et al., “Applications of Spread-Spectrum Radio to Wireless Terminal 
Communications,” Proc. NTC’80 (June 1980), pp. 69.7.1-.4. 

9. S. Nanayakkara and J. B. Anderson, “High Speed Receiver Designs Based on 
Surface Acoustic Wave Devices,” Satellite Commun., 2, No. 2 (April 1984), pp. 
121-8. 

10. J. G. Proakis, Digital Communications, New York: McGraw-Hill, 1983. 

11. P. S. Henry and B. S. Glance, “A New Approach to High Capacity Digital Mobile 
Radio,” B.S.T.J., 60, No. 8 (October 1981), pp. 1891-904. 

12. G. Turin, “Introduction to Spread-Spectrum Anti-Multipath Techniques and Their 
Application to Urban Digital Radio,” Proc. IEEE, 68 (March 1980), pp. 328-53. 

13. M. B. Pursley, “Spread-Spectrum Multiple-Access Communication,” CISM Course 
and Lectures No. 265, G. Longo, ed., New York: Springer-Verlag, 1981. 

14. J. S. Lehnert and M. B. Pursley, “Multipath Diversity Reception of Coherent 
Direct-Sequence Spread-Spectrum Communications,” Proc. 1983 Conf. Inform. 
Sci. Syst., The Johns Hopkins University, March 23-25, 1983. 

15. A. Papoulis, Probability, Random Variables, and Stochastic Processes, Second Edi- 
tion, New York: McGraw-Hill, 1984. 

16. M. B. Pursley, “Performance Evaluation for Phase Coded Spread-Spectrum Multi- 
ple Access Communication—Part II: Code Sequence Analyses,” IEEE Trans. 
Commun., COM-25 (August 1977), pp. 800-3. 

17. C. E. Sundberg, “Error Probability of Partial Response Continuous-Phase Modu- 
lation With Coherent MSK-Type Receiver, Diversity, and Slow Rayleigh Fading 
in Gaussian Noise,” B.S.T.J., 61, No. 8 (October 1982), pp. 1933-63. 

18. V. Pless, Introduction to the Theory of Error-Correcting Codes, New York: Wiley- 
InterScience, 1982. 

19. S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications, 
Englewood Cliffs, N.J.: Prentice-Hall, 1983. 

20. H. F. A. Roefs and M. B. Pursley, “Correlation Parameters of Random Sequences 
and Maximal Length Sequences for Spread-Spectrum Multiple-Access Commu- 
nications,” IEEE Trans. Commun., COM-27 (October 1979), pp. 1597-604. 

21. A. A. M. Saleh and R. Valenzuela, private communication. 

22. R. Valenzuela, private communication. 

23. W. C. Wong, O. G. Jaffee, and D. J. Goodman, private communication. 

24. Further Notice of Inquiry and Notice of Proposed Rulemaking, “Authorization of 
Spread-Spectrum and Other Wideband Emissions Not Presently Provided for in 
the FCC Rules and Regulations,” May 21, 1984. 



25. D. M. J. Devasirvatham, “Time Delay Spread Measurements of Wideband Radio 
Signals Within a Building,” Electron. Lett., 20, No. 23 (November 1984), pp. 950- 


26. G. J. Foschini and J. Salz, “Digital Communications Over Fading Radio Channels,” 
B.S.T.J., 62, No. 2 (February 1983), pp. 429-56. 

27. B. Glance and L. J. Greenstein, “Frequency- Selective Fading Effects in Digital 
Mobile Radio With Diversity Combining,” IEEE Trans. Commun., COM-31 
(September 1983), pp. 1085-94. 

28. G. C. Clark, Jr., and J. B. Cain, Error-Correction Coding for Digital Communications, 
New York: Plenum Press, 1981. 


APPENDIX A 
Channel Coding Formulations 


In this Appendix we derive the formula for the bit error probability 
used in the paper. Independent channel errors are assumed and only 
single-error-correcting, (7, 4) Hamming code and double-error-cor- 
recting, (15, 7) BCH codes are considered. 


A.1 The (7, 4) Hamming code 


We first determine the bit error probability for a small channel error 
probability $p_e = 1 - q_e$. The first term in the power series in $p_e$ 
for $P_{b1}$ will be denoted as $P_{b1}^1$. A well-known approximation 
[see eq. (1-27) of Ref. 28] gives $P_{b1}^1 = dP_w/n$, where 

\[ P_w = \binom{7}{2} p_e^2 q_e^5. \]

Here $P_w$ is the probability, for small $p_e$, that a code vector is in 
error and $d$ is the minimum Hamming distance. As $n = 7$ and $d = 3$ 
we have $P_{b1}^1 = 9p_e^2q_e^5$ for small $p_e$ for the first term in 
eq. (30). 

The weight of a code word is the number of its nonzero symbols. 
Let $A_i$ be the number of code words of weight $i$. For the (7, 4) 
Hamming code $A_0 = A_7 = 1$ and $A_3 = A_4 = 7$ with all other 
$A_i = 0$. Since the (7, 4) Hamming code is a perfect code, two channel 
errors always produce a weight-3 code word where we assume the all-zero 
code word is transmitted. As the code is linear we have no loss in 
generality. In the weight-3 code words, three code words contain one 
bit error, three contain two bit errors, and one contains three bit 
errors, where a bit error refers only to erroneous information bits. 
Thus, as there are four information bits, 

\[ P_{b1}^1 = \frac{1}{7}\binom{7}{2}\, p_e^2 q_e^5 \left[ \frac{3}{4} + \frac{6}{4} + \frac{3}{4} \right], \]

which gives $9p_e^2q_e^5$, as before. 

We now find a correction term to $P_{b1}^1$, which we call $P_{b1}^2$. 
When three channel errors result, either a weight-3 or a weight-4 code 
word is decoded. Now, $P_{b1}^2$ is the sum of the probabilities of these 
two cases. The weight-4 code words can be chosen in 28 ways, since each 
of the seven weight-4 code words has four correctable error patterns. 
Thus, 

\[ P_{b1}^2(\text{weight-4}) = \frac{28}{7}\, p_e^3 q_e^4 \left[ \frac{9}{4} + \frac{6}{4} + \frac{1}{4} \right] = 16\, p_e^3 q_e^4, \]

as among the weight-4 code words three have three bit errors, three 
have two bit errors, and one code word contains a single bit error. 

To consider the second component of $P_{b1}^2$, we note that a weight-3 
code word is chosen when the channel errors combine with the trans- 
mitted all-zero code word to produce a weight-3 code word. Thus, 

\[ P_{b1}^2(\text{weight-3}) = p_e^3 q_e^4 \left[ \frac{3}{4} + \frac{6}{4} + \frac{3}{4} \right] \]

to give $3p_e^3q_e^4$. Adding the results for $P_{b1}^2$, we get the 
correction term in eq. (30) of the paper. We get exactly the same result 
by applying the approximation $dP_w/n$ to the weight-3 and weight-4 
error events. That is, for the weight-3 bit error probability we have 
$7 \times 3p_e^3q_e^4/7$ and for weight-4, $28 \times 4p_e^3q_e^4/7$, 
to get $P_{b1}^2 = 19p_e^3q_e^4$. 
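
The coefficients 9 and 19 derived above can be checked by brute force. The following Python sketch is illustrative only. It assumes one particular systematic (7, 4) Hamming code (the parity rows in `PARITY_ROWS` are an assumption, not taken from the paper), transmits the all-zero code word, decodes every error pattern of a given weight to the nearest code word, and averages the resulting information-bit errors over the four information bits. The names `encode`, `decode`, and `coefficient` are hypothetical.

```python
from itertools import combinations, product

# Assumed systematic (7, 4) Hamming code: code word = (4 info bits, 3 parity bits).
# Each info bit contributes one of these parity rows (illustrative choice).
PARITY_ROWS = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]


def encode(info):
    """Return the 7-bit code word for a 4-bit information word."""
    parity = [0, 0, 0]
    for bit, row in zip(info, PARITY_ROWS):
        if bit:
            parity = [p ^ r for p, r in zip(parity, row)]
    return tuple(info) + tuple(parity)


CODEWORDS = [encode(info) for info in product((0, 1), repeat=4)]


def decode(received):
    """Nearest-code-word decoding (unique, because the code is perfect)."""
    return min(CODEWORDS,
               key=lambda c: sum(a != b for a, b in zip(c, received)))


def coefficient(num_errors):
    """Multiplier of p^w q^(7-w) in the bit error probability, for w = num_errors.

    The all-zero code word is sent, so the received word equals the error
    pattern and every 1 among the decoded info bits is a bit error.
    """
    total = 0
    for positions in combinations(range(7), num_errors):
        received = tuple(1 if i in positions else 0 for i in range(7))
        total += sum(decode(received)[:4])
    return total / 4  # average over the four information bits
```

Under these assumptions, `coefficient(2)` and `coefficient(3)` reproduce the multipliers 9 and 19 found in this section.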


A.2 The (15, 7) BCH code 
We generated the (15, 7) BCH code with generator polynomial 

\[ g(x) = 1 + x^4 + x^6 + x^7 + x^8. \]

This gave $A_0 = 1$, $A_5 = 18$, and $A_6 = 28$, which are the only 
components of the weight distribution we will need. Now $A_5$ has five 
code words with one bit error, five with two, six with three, and two 
with four. Thus, 

\[ P_{b2}^1(\text{weight-5}) = \frac{455}{18}\, p_e^3 q_e^{12} \left[ \frac{5}{7} + \frac{10}{7} + \frac{18}{7} + \frac{8}{7} \right] = \frac{2470}{18}\, p_e^3 q_e^{12}, \]

since there are $\binom{15}{3} = 455$ error patterns with three channel 
errors. Also, 

\[ P_{b2}^1(\text{weight-6}) = \frac{455}{28}\, p_e^3 q_e^{12} \left[ \frac{2}{7} + \frac{16}{7} + \frac{39}{7} + \frac{16}{7} + \frac{5}{7} \right] = \frac{5070}{28}\, p_e^3 q_e^{12}, \]

since in $A_6$ two code words have one bit error, eight have two, 
thirteen have three, four have four, and one code word has five bit 
errors. 

To combine our two values of $P_{b2}^1$, we use the density of the code 
words. For each of the 128 code words there are $\binom{15}{2} = 105$ 
correctable double-error patterns and 15 correctable single-error 
patterns. Thus the number of channel outputs within a distance two of 
code words is $121 \times 128$. There are $2^{15}$ channel outputs in 
all, to give a probability of 0.47 of falling within a decoding sphere 
of radius two. We assume 47 percent of the space around the zero code 
word is filled with weight-5 code words. In the remaining 53 percent we 
assume weight-5 and weight-6 code words are chosen with equal 
probability. Thus, 

\[ P_{b2}^1 = P_{b2}^1(\text{weight-5}) \times 0.735 + P_{b2}^1(\text{weight-6}) \times 0.265 = 150\, p_e^3 q_e^{12}, \]

which agrees well with $P_{b2}^1 = P_w d/15$, as now $d = 5$ and 

\[ P_w = \binom{15}{3} p_e^3 q_e^{12}. \]
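
The sphere-counting arithmetic in this step can be reproduced in a few lines. The sketch below is only a numerical check. The variable names are illustrative, and the two partial coefficients 2470/18 and 5070/28 are taken as printed in the text rather than recomputed.

```python
from math import comb

n, k = 15, 7  # (15, 7) BCH code

# Channel outputs within distance two of one code word: 1 + 15 + 105 = 121.
sphere = 1 + comb(n, 1) + comb(n, 2)

# Fraction of the 2**15 channel outputs that lie in some decoding sphere.
fraction = sphere * 2**k / 2**n

# Weights used in the text, with the fraction rounded to 0.47.
w5 = 0.47 + 0.53 / 2  # share assigned to weight-5 code words
w6 = 0.53 / 2         # share assigned to weight-6 code words

# Combined first-term coefficient, using the printed partial coefficients.
combined = w5 * 2470 / 18 + w6 * 5070 / 28

# The d * P_w / n approximation with d = 5.
approx = comb(n, 3) * 5 / n

print(sphere, round(fraction, 4), w5, round(combined, 1), round(approx, 1))
```

Both `combined` and `approx` land close to the coefficient 150 quoted in the text.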


To get the correction term we assume that the four channel errors 
produce either a weight-5 code word or a weight-6 code word. The 
weight-5 code words have five correctable error patterns of weight 4 
per code word and thus $18 \times 5 = 90$ weight-4 channel outputs in 
their decoding spheres. For weight-6 code words we have 
$\binom{6}{4} = 15$ correctable error patterns and thus 
$28 \times 15 = 420$ channel outputs in weight-6 decoding spheres. This 
leaves $\binom{15}{4} - 510 = 855$ error patterns and we assume they 
are equally divided between weight-5 and weight-6 code words. Averaging 
over the bit errors in weight-5 and weight-6 codes gives 

\[ P_{b2}^2 = \left( \frac{518 \times 41}{18 \times 7} + \frac{848 \times 78}{28 \times 7} \right) p_e^4 q_e^{11} = 506\, p_e^4 q_e^{11}, \]

where the first term is for weight-5 code words. Use of the $dP_w/n$ 
approximation gives 

\[ P_{b2}^2 = \left( \frac{518 \times 5}{15} + \frac{848 \times 6}{15} \right) p_e^4 q_e^{11} = 512\, p_e^4 q_e^{11}, \]

which is in good agreement with the result just given above. 
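
The counting behind the correction term can be checked numerically. This is a sketch under stated assumptions: the totals 41 and 78 are the summed information-bit errors implied by the weight-5 and weight-6 code-word counts given earlier (taking the last weight-6 entry to be a code word with five bit errors), and all variable names are illustrative.

```python
from math import comb

# Weight-4 channel outputs already inside decoding spheres.
in_spheres = 18 * 5 + 28 * comb(6, 4)   # 90 + 420 = 510

# Remaining weight-4 error patterns, split equally between the two weights.
leftover = comb(15, 4) - in_spheres     # 1365 - 510 = 855
n5 = 18 * 5 + leftover / 2              # 517.5, quoted as 518 in the text
n6 = 28 * 15 + leftover / 2             # 847.5, quoted as 848 in the text

# Averaging over bit errors (assumed totals 41 and 78, seven information bits).
averaged = 518 * 41 / (18 * 7) + 848 * 78 / (28 * 7)

# The d * P_w / n style approximation with n = 15.
approx = (518 * 5 + 848 * 6) / 15

print(leftover, n5, n6, round(averaged, 1), round(approx, 1))
```

The two coefficients come out near 506 and 512, matching the agreement noted in the text.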
Nonlinear Input-Output Maps and Approximate 
Representations* 


By I. W. SANDBERG† 
(Manuscript received May 20, 1985) 


An approximation theorem is given for causal time-invariant nonlinear 
maps that take one set of functions defined on [0, ∞) into another. The 
theorem is used to show that, under some typically very reasonable conditions, 
an input-output map can be approximated arbitrarily well in a meaningful 
sense by a finite Volterra series, even though it may not have a Volterra series 
expansion. The set of inputs on which the approximation holds need not be 
compact, and the inputs need not be continuous. 


I. INTRODUCTION 


In this paper an approximation theorem is given for causal time- 
invariant nonlinear maps that take one set of functions defined on 
[0, ∞) into another. The theorem is used to show that, under some 
typically very reasonable conditions, an input-output map can be 
approximated arbitrarily well in a meaningful sense by a finite Volterra 
series, even though it may not have a Volterra series expansion. The 
set of inputs on which the approximation holds need not be compact. 
A more detailed introduction follows. 


1.1 Background 


Researchers have long been interested in a variety of questions 
concerning the mathematical representation of systems that need not 
be linear. In recent years much has been learned about the existence, 
determination, and properties of power-series-like expansions for 
expressing a system’s outputs in terms of its inputs (see, for instance, 
Refs. 1 through 7). An important example of the form of such an 
expansion is 

\[ w(t) = \sum_{q=1}^{\infty} \int_0^t \cdots \int_0^t k_q(\tau_1, \ldots, \tau_q)\, u(t - \tau_1) \cdots u(t - \tau_q)\, d\tau_1 \cdots d\tau_q, \quad t \ge 0, \tag{1} \]

in which w is the output, u is the input, the kernels k_q are functions 
determined by the system, and u is drawn from a set of bounded real- 
valued inputs such that the right side of (1) converges uniformly in t 
for every input. While certain more general versions of (1), where the 
k_q are symbolic functions and vector-valued inputs are taken into 
account, are frequently needed, in all cases one has 

\[ w = \sum_{q=1}^{\infty} K_q(u), \quad u \in U, \tag{2} \]

in which U is a set of inputs and each K_q is a homogeneous map of 
degree q. Under weak assumptions these K_q are uniquely determined, 
and as such are very special associates of the system represented by 
(2). 

The right side of (1) is an example of what is often called a Volterra 
series. Actually, Volterra considered not (1) but related expansions in 
which the integration limits are constants and the u(t − τ_j) are 
replaced with u(τ_j). These expansions were used by Volterra as a 
model of a nonlinear functional in his path-breaking studies of oper- 
ations on functionals. A comprehensive account of this work is given 
in Ref. 8, where attention is directed to a representation result (Ref. 
8, page 20) due to Fréchet that concerns, in particular, the uniform 
approximation of continuous functionals on compact sets, using a 
finite number of terms in Volterra’s series. Volterra also mentions the 
analogy between this aspect of Fréchet’s result and the Weierstrass 
approximation theorem for continuous real functions on a real com- 
pact interval. 

While Fréchet’s Weierstrass-like result is certainly interesting and 
important, it does have significant limitations with regard to the 
representation of input-output maps: (1) it concerns approximations 
rather than expansions in the usual sense, (2) these approximations 
are on compact sets, (3) it directly concerns functionals rather than 
mappings from one function space to another, etc. These limitations, 
as well as those of the more general representation result of Fréchet 
described in Ref. 8, p. 20, do not appear to have been always 
appreciated by early writers concerned with applications of Volterra 
series, who sometimes cited Ref. 8 as though it contained a 
justification for the use of relations of the form (1). 

There can be very basic differences between an arbitrarily good 
approximation and an expansion, and, more to the point, between 
knowing that one, as opposed to the other, exists. For example, an 
approximation known to exist may not have properties that facilitate 
its determination. Nevertheless, existence results concerning approx- 
imations can sometimes be useful, especially when an expansion does 
not exist. Thus, it is clearly of interest to consider the extent to which 
input-output maps of systems can be approximated in some meaning- 
ful sense by a finite sum, 


\[ \sum_{q=1}^{Q} \int_0^t \cdots \int_0^t k_q(\tau_1, \ldots, \tau_q)\, u(t - \tau_1) \cdots u(t - \tau_q)\, d\tau_1 \cdots d\tau_q, \quad t \ge 0, \tag{3} \]


of terms of the form that appear on the right side of (1), or by a finite 
sum of suitably more general terms if u is vector valued. Of course, 
approximations involving larger classes of finite sums of iterated 
integrals can be of interest too. 
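
As an illustration of a finite sum of the form (3), the sketch below evaluates a discretized version with Q = 2 for a scalar input. It is an example built on assumptions rather than anything taken from the paper: the integrals are replaced by Riemann sums with step `dt`, the kernels are supplied as sampled lists `k1` and `k2`, and the function name `volterra2` is hypothetical.

```python
def volterra2(u, k1, k2, dt):
    """Discretized two-term Volterra sum for a scalar input.

    u  : list of input samples u(0), u(dt), u(2*dt), ...
    k1 : samples of the first-order kernel, k1[i] ~ k_1(i*dt)
    k2 : samples of the second-order kernel, k2[i][j] ~ k_2(i*dt, j*dt)
    Returns the list of output samples w(n*dt).
    """
    w = []
    for n in range(len(u)):
        # Only lags 0..n contribute: the integrals in (3) run over [0, t].
        m = min(n + 1, len(k1))
        past = [u[n - i] for i in range(m)]  # u(t - tau_i) for each lag
        first = dt * sum(k1[i] * past[i] for i in range(m))
        second = dt * dt * sum(k2[i][j] * past[i] * past[j]
                               for i in range(m) for j in range(m))
        w.append(first + second)
    return w
```

With a separable second-order kernel, k2[i][j] = k1[i] * k1[j], the second-order term equals the square of the first-order term, which gives a convenient sanity check on the implementation.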

Related questions have in fact been considered for many years,” » 
and as one might expect, the main mathematical tool that has been 
used is the Stone-Weierstrass theorem. In this earlier work the input 
signals considered are assumed to belong to a Hilbert space (e.g., an 
L_2 space) and/or to be defined on only a finite interval, and the inputs 
are taken to belong to a compact set. In contrast, in Ref. 16 an 
approximation result is given for input-output maps that act between 
certain subsets of the Banach space C(ℝ) of bounded, continuous, 
real-valued functions defined on the doubly infinite interval (−∞, ∞), 
with the usual norm. The maps considered there are assumed to be 
time invariant and to have a “fading-memory” property that enables 
one to prove that a certain set of functions defined on (−∞, 0] is 
compact. The extent to which the results in Ref. 16 bear on the main 
problem considered in this paper, where the input and output signals 
are defined on [0, ∞), is not discussed in Ref. 16. Although the results 
in this paper are considerably different from those in Ref. 16, there 
are some similarities: an approximately-finite memory hypothesis 
(related to hypotheses in Ref. 17, Section 2.2) plays a central role, and 
we too depend on a form of the Stone-Weierstrass theorem. On the 
other hand, the compact sets with which we deal are always sets of 
functions defined on a finite interval [0, ω]. 


1.2 Outline of this paper’s results 


In Section II, attention is focussed on a class of causal time-invariant 
maps G that take S into S_0, where S and S_0 are sets of signals (i.e., 
sets of functions) defined on [0, ∞), and the elements of S_0 are real 
valued. The maps G are assumed to possess a factorization FH, where 
H takes S into S_1, and F maps S_1 into S_0, where S_1 is a third set of 
signals on [0, ∞). Certain hypotheses on S, S_0, S_1, and the factors F 
and H are introduced in Theorem 1 of Section 2.2. Under those 
conditions, the theorem shows that given any ε > 0, there are a 
constant Δ ≥ 0, and a map P having an important special form (that 
involves a real polynomial p in several variables together with a certain 
“fundamental set” of maps) such that the approximation 


\[ | G(u)(t) - (PH)(u_{\max\{0,\, t-\Delta\}})(t) | < \epsilon, \quad t \ge 0 \]

holds for all u ∈ S, where u_ω for arbitrary nonnegative ω is defined by 
u_ω(t) = 0 if 0 ≤ t < ω, and u_ω(t) = u(t) for t ≥ ω. One of the main 
hypotheses used is that the memory of G is “approximately finite,” to 
the extent that for any δ_0 > 0 there is a δ > 0 for which 

\[ | Gu(t) - Gu_{\max\{0,\, t-\delta\}}(t) | < \delta_0 \]

for t ≥ 0 and u ∈ S. This hypothesis can be shown to be satisfied in 
many cases of interest. More will be said about this later. 
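
To make the approximately-finite-memory hypothesis concrete, the following sketch checks it numerically for one simple map. The map is an assumption chosen purely for illustration and does not come from the paper: (Gu)(t) = tanh of the convolution of u with the kernel exp(−t), for inputs bounded by 1. For this map, zeroing the input before t − δ changes the output at time t by at most exp(−δ), so δ = ln(1/δ_0) suffices. The names `output_at` and `worst_gap` are hypothetical, and the convolution is discretized with step `dt`.

```python
import math


def output_at(u, n, dt, start=0):
    """tanh of the discretized convolution at sample n, using only u[start:]."""
    acc = sum(math.exp(-(n - m) * dt) * u[m] * dt for m in range(start, n + 1))
    return math.tanh(acc)


def worst_gap(u, dt, delta):
    """Largest |Gu(t) - G u_{max{0, t - delta}}(t)| over the sampled times t."""
    lag = int(round(delta / dt))
    return max(
        abs(output_at(u, n, dt) - output_at(u, n, dt, max(0, n - lag)))
        for n in range(len(u))
    )
```

For any sampled input with values in [−1, 1], `worst_gap(u, dt, delta)` should not exceed `math.exp(-delta)` when `delta` is a multiple of `dt`, because tanh has slope at most 1 and the discarded tail of the kernel sums to at most exp(−δ).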

The hypotheses of Theorem 1 are of an abstract nature and so is its 
conclusion. The theorem is used as a “tool theorem” in the proof of 
Theorem 2 in Section 2.4 which addresses a case that is of direct 
interest in applications. In Theorem 2, the memory of G is assumed 
to be approximately finite, S is taken to be a set of uniformly bounded 
vector-valued functions on [0, ∞), S_1 is a similar set, H is a 
convolution, and F is causal, time invariant, and continuous in a certain 
typically reasonable sense (see the theorem for additional details). 
Assume now for the sake of simplifying the discussion that the elements 
of S are scalar valued. According to the theorem, under the conditions 
stated there, Gu can be uniformly approximated arbitrarily well on S by 
a finite sum of the form (3). Again, the reader is referred to the 
theorem for the details. Related results concerning discrete-time cases 
and composites of maps that have approximately-finite memory are given 
in Sections 2.5 and 2.3, respectively. 

The case considered in Theorem 2 arises often. This is discussed in 
Appendix B, where an important class of input-output maps is ad- 
dressed, and where a technique for showing that the memory of a 
nonlinear map is approximately finite is illustrated. 


Il. INPUT-OUTPUT MAPS AND APPROXIMATIONS 
2.1 Preliminaries 


Throughout Section II, V is a linear space, Ω denotes the interval 
[0, ∞), t and ω are elements of Ω, S and S_0 are two sets of functions 
on Ω, the elements of S and S_0 take values in V and (−∞, ∞), 
respectively, and G is a map from S to S_0. 

We use S_1 to denote a third set of functions on Ω; these take values 
in a normed linear space V_1. It is assumed that 


G = FH, 


where H maps S into S_1 and F takes S_1 into S_0. 

The set S, which is our set of inputs, is assumed to have the following 
properties: 

(i) u ∈ S ⇒ (u)_ω ∈ S for each ω, where (u)_ω(t) = u(t) for t ≤ ω, 
and (u)_ω(t) = 0 (here the zero element of V) otherwise. 

(ii) u ∈ S ⇒ (T_ω u) ∈ S for ω ≠ 0, in which T_ω is defined by 
(T_ω u)(t) = 0 (0 ≤ t < ω) and (T_ω u)(t) = u(t − ω) for t ≥ ω. 

(iii) u ∈ S ⇒ u_ω ∈ S for ω ≠ 0, where u_ω is given by u_ω(t) = 0 for 
0 ≤ t < ω and u_ω(t) = u(t) for t ≥ ω. [Note the distinction between 
u_ω and (u)_ω of Property (i).] 

(iv) ω ≠ 0 and u ∈ S with u(t) = 0 (0 ≤ t < ω) ⇒ v ∈ S, where 
v(t) = u(t + ω), t ≥ 0. 

Notice that (i)-(iv) simply require that S be closed under certain 
elementary operations. With regard to S_1 we assume that 

(v) Properties (ii) and (iii) hold with S replaced with S_1. 
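
The four elementary operations in Properties (i)-(iv) are easy to confuse, so a small discrete-time sketch may help. It assumes signals are finite Python lists indexed by t = 0, 1, 2, ... (a simplification made here, not in the paper), and the function names are hypothetical.

```python
def truncate_after(u, w):
    """(u)_w of Property (i): keep samples with t <= w, zero the rest."""
    return [x if t <= w else 0 for t, x in enumerate(u)]


def delay(u, w):
    """T_w u of Property (ii): zero for t < w, then u(t - w)."""
    return [0 if t < w else u[t - w] for t in range(len(u))]


def zero_before(u, w):
    """u_w of Property (iii): zero for t < w, u(t) from t = w on."""
    return [0 if t < w else x for t, x in enumerate(u)]


def advance(u, w):
    """v(t) = u(t + w) of Property (iv); the finite list gets shorter by w."""
    return u[w:]
```

For example, with u = [1, 2, 3, 4, 5] and w = 2, `truncate_after` keeps the first three samples, `zero_before` keeps the last three, and `advance(delay(u, 2), 2)` recovers the leading samples of u.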

We use the standard definitions of causality and time invariance. 
That is, a map M from S to S_0, from S to S_1, or from S_1 to S_0 is 
causal if u_1 and u_2 in the domain of M with u_1(t) = u_2(t) 
(0 ≤ t ≤ ω) always implies that (Mu_1)(t) = (Mu_2)(t) for 0 ≤ t ≤ ω; 
M is time invariant if ω ≠ 0 and u in the domain of M ⇒ 
(MT_ω u)(t) = 0 for 0 ≤ t < ω and (MT_ω u)(t) = (Mu)(t − ω), t ≥ ω. 
Also, by M ∈ 𝒜(S) or M ∈ 𝒜(S_1) we mean that the domain of M is S or 
S_1, respectively, and that M has “approximately-finite memory” in the 
sense that for each constant δ_0 > 0 there is a positive δ ∈ Ω such that 

\[ | (Mu)(t) - (Mu_{\max\{0,\, t-\delta\}})(t) | < \delta_0, \quad t \ge 0 \]

for all u in the domain of M. Here | · | denotes simply the absolute 
value if the range of M is S_0; it denotes the norm in V_1 otherwise. 

The set of functions x defined on [0, ω] with values in V_1 such that 
x(t) = y(t) (0 ≤ t ≤ ω) for some y ∈ S_1 is denoted by (S_1)_ω for 
each ω. We use H(S)_ω to stand for the set of functions x in (S_1)_ω 
such that x(t) = (Hu)(t) (0 ≤ t ≤ ω) for some u ∈ S. It is assumed in 
Theorem 1 (below) that the H(S)_ω have the following property: 

(vi) There is a family of metric spaces {(X_ω, ρ_ω): ω > 0} such that 
for each ω > 0 we have H(S)_ω ⊂ X_ω ⊂ (S_1)_ω and X_ω is compact (i.e., 
compact in itself) with respect to the metric ρ_ω. 

For ω ≠ 0 and each causal map M from S_1 to S_0, M_ω denotes the 
functional on (S_1)_ω defined by 

\[ M_\omega x = (My)(\omega), \quad x \in (S_1)_\omega, \tag{4} \]

where y ∈ S_1 satisfies x(t) = y(t), 0 ≤ t ≤ ω. The following is assumed 
with regard to F in Theorem 1 (below): 

(vii) F is causal and for each ω ≠ 0, F_ω is continuous on X_ω with 
respect to ρ_ω [where X_ω and ρ_ω are described in (vi)]. 

Finally, by a fundamental set {F_α: α ∈ A} of maps from S_1 to S_0 
relative to (vi), we mean that the F_α are causal and time invariant, 
and that for ω ≠ 0 the corresponding family {F_{αω}: α ∈ A} of 
functionals on (S_1)_ω is continuous on X_ω with respect to ρ_ω and 
separates the points of X_ω. (By “separates the points of X_ω” is meant 
(Ref. 18, p. 41) that for each pair of distinct elements x_1 and x_2 of 
X_ω there is an α ∈ A such that F_{αω}x_1 ≠ F_{αω}x_2.) 


2.2 The main approximation result 


In this section we prove the following: 
Theorem 1: Let (i)-(vii) be met, with F and H causal and time 
invariant, and with G ∈ 𝒜(S). Suppose that there is a fundamental set 
{F_α: α ∈ A} of maps from S_1 to S_0 relative to (vi). Then for each 
ε > 0 there are a Δ ∈ Ω, a positive integer k, elements F_{α_1}, ···, 
F_{α_k} of {F_α: α ∈ A}, and a real polynomial p in k real variables 
with p(0, ···, 0) = 0 such that 

\[ | G(u)(t) - (PH)(u_{\max\{0,\, t-\Delta\}})(t) | < \epsilon, \quad t \ge 0 \tag{5} \]

for every u ∈ S, where P is the map from S_1 into S_0 given by 
(Py)(t) = p[F_{α_1}(y)(t), ···, F_{α_k}(y)(t)], t ≥ 0 for y ∈ S_1. In 
addition, the map Q: S → S_0 defined by 
(Qu)(t) = (PH)(u_{max{0,t−Δ}})(t) for u ∈ S and t ≥ 0 is causal and 
time invariant. 


2.2.1 Proof of Theorem 1 
Proof: Given ε, choose a positive Δ ∈ Ω so that 

\[ | (Gu)(t) - G(u_{\max\{0,\, t-\Delta\}})(t) | < \epsilon/2, \quad t \ge 0 \tag{6} \]

for u ∈ S. Observe that S contains an element θ such that θ(t) = 0 for 
t ≥ 0. By the time invariance of H, H(S) also contains such an 
element, and therefore there is e ∈ X_Δ such that e(t) = 0 for 
t ∈ [0, Δ]. By the causality and time invariance of F, we have 
F_Δ e = 0. Using a version of the Stone-Weierstrass theorem (see Ref. 
18, p. 46), Condition (vii), and the hypothesis that there is a 
fundamental set {F_α: α ∈ A} of maps from S_1 to S_0 relative to (vi),* 
there are a positive integer k, a polynomial p as described, and 
elements F_{α_1}, ···, F_{α_k} of the fundamental set such that 

\[ | F_\Delta x - P_\Delta x | < \epsilon/2, \quad x \in X_\Delta, \]

where the functional P_Δ is the associate [see (4)] of P described in 
the statement of the theorem. Thus, using H(S)_Δ ⊂ X_Δ, we have 

\[ | (FHu)(\Delta) - (PHu)(\Delta) | < \epsilon/2, \quad u \in S. \tag{7} \]

* It was necessary to establish that F_Δ e = 0 because F_{αΔ} e = 0 for 
all α ∈ A (see Ref. 18, Theorem 5). 

Now let any u ∈ S and t ∈ Ω be given. Suppose first that t > Δ. Let 
v be defined by v(τ) = u_{(t−Δ)}[τ + (t − Δ)], τ ≥ 0 [notation of 
Condition (iii)]; here we have used Conditions (iii) and (iv). By the 
time invariance of G = FH and of PH in (7), one has 
G[u_{(t−Δ)}](t) = G(v)(Δ) and (PH)[u_{(t−Δ)}](t) = (PH)(v)(Δ). Thus, 
by (7), 

\[ | G[u_{(t-\Delta)}](t) - (PH)[u_{(t-\Delta)}](t) | < \epsilon/2. \]

Using this and (6), 

\[ | G(u)(t) - (PH)[u_{\max\{0,\, t-\Delta\}}](t) | \le | G(u)(t) - G(u_{\max\{0,\, t-\Delta\}})(t) | + | G(u_{\max\{0,\, t-\Delta\}})(t) - (PH)(u_{\max\{0,\, t-\Delta\}})(t) | < \epsilon/2 + \epsilon/2 = \epsilon. \]

Suppose now that t < Δ. Then by Conditions (i) and (ii) and the 
causality and time invariance of G, we see that 
G(u)(t) = G[(u)_t](t) = G[T_{(Δ−t)}(u)_t](Δ) [notation of Conditions 
(i) and (ii)], and similarly (PH)(u)(t) = (PH)[T_{(Δ−t)}(u)_t](Δ). 
Thus, using (7), 

\[ | (Gu)(t) - (PH)(u)(t) | < \epsilon/2. \tag{8} \]

Since (8) holds also for t = Δ, and obviously u = u_{max{0,t−Δ}} when 
t ≤ Δ, we have (5). At this point it suffices to prove the following: 


Lemma: Let K be a causal time-invariant map of S into S_0, and with 
Δ ∈ Ω, let M be the map from S to S_0 given by 

\[ (Mu)(t) = K(u_{\max\{0,\, t-\Delta\}})(t), \quad t \ge 0 \]

for u ∈ S. Then M is causal and time invariant. 

Proof: Let u_1 and u_2 in S satisfy u_1(τ) = u_2(τ) for 0 ≤ τ ≤ ω, and 
let t ∈ [0, ω]. Since (u_{1,max{0,t−Δ}})_t = (u_{2,max{0,t−Δ}})_t and K 
is causal, it is clear that 
(Mu_1)(t) = K[(u_{1,max{0,t−Δ}})_t](t) = K[(u_{2,max{0,t−Δ}})_t](t) = 
(Mu_2)(t), showing that M is causal. 

Now let u ∈ S and let ω in Ω be nonzero. For t < ω, 
(MT_ω u)(t) = K[{T_ω u}_{max{0,t−Δ}}](t) = 0, because K is time 
invariant and {T_ω u}_{max{0,t−Δ}}(τ) = 0 for τ < ω. Suppose that 
t ≥ ω. Since (as can easily be verified) 
{T_ω u}_{max{0,t−Δ}} = T_ω u_{max{0,t−ω−Δ}}, one has 
(MT_ω u)(t) = K(T_ω u_{max{0,t−ω−Δ}})(t) = 
K(u_{max{0,t−ω−Δ}})(t − ω) = (Mu)(t − ω), by the time invariance of K. 
Thus M is time invariant. This completes the proof of the lemma and of 
the theorem. 


2.3 Comments 


All of the material in Sections 2.1 and 2.2 remains valid if Ω is 
replaced throughout with {0, 1, ···} (with the understanding that 
then [0, ω] means {0, ···, ω}). 

The following result concerning the hypothesis that G ∈ 𝒜(S) is 
frequently useful. By “F ∈ Lip[H(S)]” below is meant that there is a 
positive constant c such that 

\[ | (Fy_1)(t) - (Fy_2)(t) | \le c \sup | y_1(\tau) - y_2(\tau) | \]

for t ≥ 0 and y_1 and y_2 in H(S), where | · | on the right side denotes 
the norm in V_1. 

Proposition: Let (i)-(v) be met,* with H ∈ 𝒜(S), F ∈ 𝒜(S_1), and 
F ∈ Lip[H(S)]. Then G ∈ 𝒜(S). 

* For the sake of ease of exposition, (i)-(v) are assumed to be 
satisfied. However, only (iii) and (iii) with S replaced with S_1 are 
used. 

The proposition is proved in Appendix A. For an application, see 
Appendix B. 


2.4 Approximations and finite Volterra series 


In the following theorem, n and p are arbitrary positive integers and 
L_∞(n) and L_∞(p), respectively, denote the normed linear spaces of 
real n-vector valued and real p-vector valued Lebesgue measurable 
functions u defined on Ω such that ||u|| ≜ sup_t | u(t) | < ∞, in which 
| u(t) | = max_j | u_j(t) | and u_j(t) stands for the jth component of 
u(t). By a “vector,” we mean a column vector. Also, for any positive 
integer q and any q n-vectors a_1, ···, a_q, we use x[a_1, ···, a_q] 
to denote the vector of order n^q whose elements are the n^q distinct 
products (a_1)_{λ_1} ··· (a_q)_{λ_q} corresponding to distinct 
sequences λ_1, ···, λ_q with each λ_j drawn from {1, ···, n}, arranged 
in an arbitrary predetermined order that depends only on q and n. Of 
course, x[a_1, ···, a_q] is simply the product a_1 ··· a_q if n = 1. 
Finally, we use K(q, s) (q, s positive; q an integer) to denote the 
set of all functions k from [0, ∞)^q to the 1 × n^q matrices such that 

\[ k_j(\tau_1, \ldots, \tau_q) = \sum_{r=1}^{R_j} \prod_{i=1}^{q} \phi_{jir}(\tau_i), \tag{9} \]

for all τ_1, ···, τ_q and all j ∈ {1, ···, n^q}, where R_j < ∞ and the 
φ_{jir} are real valued and continuous on [0, s] and vanish on (s, ∞). 
[Notice that K(q, s) is simply a set of row-matrix-valued functions 
whose elements have a certain nice finite sum of products 
representation.] 


Theorem 2: Let G ∈ 𝒜(S), with S = {u ∈ L_∞(n): ||u|| ≤ β}, where β is 
a positive constant. Assume that H is defined by 

\[ (Hu)(t) = \int_0^t h(t - \tau)\, u(\tau)\, d\tau, \quad t \ge 0 \]

for u ∈ S, where h is a real p × n matrix-valued function on Ω such 
that each h_{ij} is (Lebesgue) integrable on Ω. Take S_1 to be 
{y ∈ L_∞(p): ||y|| ≤ β_1}, in which β_1 is any number that satisfies 
sup_t | (Hu)(t) | ≤ β_1 for u ∈ S.* Assume also that F is causal and 
time invariant on S_1, and that F satisfies the continuity condition 
that given a continuous y in S_1, and numbers t ∈ (0, ∞) and δ_1 > 0, 
there is a δ_2 > 0 such that 

\[ | (Fy)(t) - (Fz)(t) | < \delta_1 \]

whenever z ∈ S_1, z is continuous, and 
max_{τ∈[0,t]} | y(τ) − z(τ) | < δ_2. Under these conditions, given any 
ε > 0, there is a positive integer Q, an s ∈ (0, ∞), and elements k_q 
of K(q, s) (1 ≤ q ≤ Q) such that 

\[ | (Gu)(t) - (Vu)(t) | < \epsilon, \quad t \ge 0 \tag{10} \]

for all u ∈ S, where 

\[ (Vu)(t) = \sum_{q=1}^{Q} \int_0^t \cdots \int_0^t k_q(\tau_1, \ldots, \tau_q)\, x[u(t - \tau_1), \ldots, u(t - \tau_q)]\, d\tau_1 \cdots d\tau_q. \tag{11} \]


2.4.1 Proof of Theorem 2 


Proof: We use Theorem 1. Conditions (i)-(v) are met with $F$ and $H$ causal and time invariant, and with $G \in \mathcal{A}(S)$.

Consider Condition (vi). Let $w$ be any positive number. For $x \in H(S)_w$,

$$x(t) = \int_0^t h(t - \tau)u(\tau)\,d\tau, \qquad t \in [0, w]$$

for some $u \in S$. In particular,

$$\sup_{t \in [0, w]} |x(t)| \le \beta_1, \tag{12}$$

and there is a function $\lambda$ from $(-\infty, \infty)$ to $[0, \infty)$, which depends only on $h$, such that $\lambda(\alpha) \to 0$ as $\alpha \to 0$, and

$$|x(t + \alpha) - x(t)| \le \beta\lambda(\alpha) \tag{13}$$

for $t$ and $(t + \alpha)$ in $[0, w]$. The existence of such a $\lambda$ follows directly from the result (see Ref. 19, p. 14) that

$$\int_{-\infty}^{\infty} |r(t + \alpha) - r(t)|\,dt \to 0 \quad \text{as } \alpha \to 0$$

when $r$ is integrable on $(-\infty, \infty)$. Now let $\{X_w, \rho_w\}$ be the metric space of all functions $x$ from $[0, w]$ to the real $p$-vectors such that (12) as well as (13) are satisfied, and

$$\rho_w(x_1, x_2) = \max_{t \in [0, w]} |x_1(t) - x_2(t)|.$$

* Our conditions here on $S$ and $S_1$ are clearly consistent with (i)-(v) of Section 2.1.


This space is easily seen to be closed. For p = 1 its elements are 
uniformly bounded and equicontinuous. Thus, the space is compact 
for p = 1. Using this fact and, for example, the proposition that 
compactness in a metric space is equivalent to sequential compactness, 
it follows that the space is in fact compact for any positive integer p. 
Clearly, $H(S)_w \subset X_w \subset (S_1)_w$, which shows that (vi) holds.

By the continuity condition on F in Theorem 2, (vii) is satisfied. 

Now let $\{F_\alpha: \alpha \in A\}$ be the set of all maps $M$ defined on $S_1$ having the representation

$$(My)(t) = \int_0^t m(t - \tau)y(\tau)\,d\tau, \qquad t \ge 0 \tag{14}$$

where $m$ is a real $(1 \times p)$ matrix-valued function on $[0, \infty)$ such that for any component $m_j$ of $m$ there is a real $\sigma > 0$ for which $m_j$ is continuous on $[0, \sigma]$ and vanishes on $(\sigma, \infty)$. Let $w > 0$, and observe that the $F_\alpha$ are continuous on $X_w$ with respect to $\rho_w$.

To see that they separate points of $X_w$, let $x_1$ and $x_2$ be distinct points of $X_w$. Let $i \in \{1, \ldots, p\}$ be such that $x_{1i}(t) - x_{2i}(t) \ne 0$ on some subinterval of $[0, w]$. Let $\alpha \in A$ be such that $(F_\alpha y)(t)$ is given by the right side of (14) with $m_i(t) = [x_{1i}(w - t) - x_{2i}(w - t)]$ for $t \in [0, w]$ and $m_i(t) = 0$ otherwise, and with $m_j$ vanishing on $[0, \infty)$ for $j \ne i$. Then

$$(F_\alpha x_1)(w) - (F_\alpha x_2)(w) = \int_0^w [x_{1i}(\tau) - x_{2i}(\tau)]^2\,d\tau > 0.$$


Thus $\{F_\alpha: \alpha \in A\}$ is a fundamental set in the sense of Theorem 1, and by Theorem 1 given $\epsilon > 0$ there are $\Delta$, $k$, $F_{\alpha_1}, \ldots, F_{\alpha_k}$, $p$, and $P$ as described there such that (10) holds with $(Vu)(t) = (PH)(u_{\max[0,\,t-\Delta]})(t)$.

For $t \ge 0$, we have

$$(PH)(u_{\max[0,\,t-\Delta]})(t) = p[(F_{\alpha_1}H)u_{\max[0,\,t-\Delta]}(t), \ldots, (F_{\alpha_k}H)u_{\max[0,\,t-\Delta]}(t)]. \tag{15}$$


It is not difficult to verify that for any $j \in \{1, \ldots, k\}$ the operator $(F_{\alpha_j}H)$ is equivalent to a convolution $C$ whose $1 \times n$ matrix-valued kernel $c$ has elements that are continuous and integrable on $\Omega$.

Also, one finds that

$$\int_0^t c(t - \tau)u_{\max[0,\,t-\Delta]}(\tau)\,d\tau = \int_0^t b(t - \tau)u(\tau)\,d\tau, \qquad t \ge 0,$$

where $b(t) = c(t)$ for $t \in [0, \Delta]$ and $b(t)$ equals the $1 \times n$ zero matrix otherwise. This together with (15), and just the observation that products of integrals can be written as iterated integrals, shows that $V$ is as described in Theorem 2.
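The last identity in the proof, which trades a truncated input for a truncated kernel, can be checked numerically in a discrete-time, scalar analog. The sketch below is only an illustration under those assumptions (sums in place of integrals, $n = 1$); the names `truncate_input`, `truncate_kernel`, and `conv_at` are hypothetical helpers introduced here.

```python
def truncate_input(u, t, delta):
    """Discrete analog of u_{max[0, t - delta]}: keep u(tau) for
    tau >= max(0, t - delta) and set earlier samples to zero."""
    start = max(0, t - delta)
    return [u[tau] if tau >= start else 0 for tau in range(len(u))]


def truncate_kernel(c, delta):
    """Kernel b with b(k) = c(k) for k in [0, delta] and zero otherwise."""
    return [c[k] if k <= delta else 0 for k in range(len(c))]


def conv_at(kernel, u, t):
    """Causal convolution sum_{tau=0}^{t} kernel(t - tau) * u(tau)."""
    return sum(kernel[t - tau] * u[tau] for tau in range(t + 1))


# Example data (arbitrary integers, so the comparison is exact).
c = [4, 3, 2, 1, 1, 1]
u = [1, -2, 3, 5, -1, 2]
delta = 2
b = truncate_kernel(c, delta)

lhs = [conv_at(c, truncate_input(u, t, delta), t) for t in range(len(u))]
rhs = [conv_at(b, u, t) for t in range(len(u))]
```

In this analog `lhs` and `rhs` agree at every `t`, since both retain exactly the samples `u[tau]` with `t - tau <= delta`.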


2.5 Comments 

Cases in which the conditions of Theorem 2 are met arise often in 
applications. This is illustrated in Appendix B where the theorem is 
used to show that an important large class of input-output maps have 
finite Volterra series approximations. 

In the proof of Theorem 2, the $F_\alpha$ are taken to be linear operators. It is clear that related additional approximation theorems can be obtained by allowing the $F_\alpha$ to be nonlinear.

Equation (15) in the proof of Theorem 2 shows that G in the theorem 
can be approximated arbitrarily well by a linear dynamic subsystem 
followed by a memoryless nonlinear subsystem with “polynomial non- 
linearities.” The existence of approximate system representations 
involving linear subsystems with an additional (and constant) input 
and only nonlinearities that take absolute values can be proved using 
Theorem 3 of Ref. 18. 
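The structure described in connection with Equation (15), linear dynamic subsystems followed by a memoryless polynomial, and the remark in the proof that products of integrals can be written as iterated integrals, can be illustrated in a discrete-time scalar setting. The following sketch is an assumption-laden toy case (finite-length kernels `b1` and `b2`, a single product nonlinearity) rather than the construction used in the proof; all names are hypothetical.

```python
def fir_output(b, u, t):
    """Output at time t of a causal linear subsystem with kernel b."""
    upper = min(t, len(b) - 1)
    return sum(b[tau] * u[t - tau] for tau in range(upper + 1))


def product_of_filters(b1, b2, u, t):
    """Two linear subsystems followed by the memoryless polynomial
    p(z1, z2) = z1 * z2."""
    return fir_output(b1, u, t) * fir_output(b2, u, t)


def second_order_term(b1, b2, u, t):
    """The same quantity written as a second-order Volterra term whose
    kernel is the product k(tau1, tau2) = b1(tau1) * b2(tau2)."""
    total = 0
    for tau1 in range(min(t, len(b1) - 1) + 1):
        for tau2 in range(min(t, len(b2) - 1) + 1):
            total += b1[tau1] * b2[tau2] * u[t - tau1] * u[t - tau2]
    return total


b1 = [1, 2]
b2 = [3, 0, 1]
u = [1, 2, 3, 4]
```

With these integer values the two functions return identical results for every `t`, which is the discrete counterpart of rewriting a product of convolutions as one iterated sum with a sum-of-products kernel of the kind in (9).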

The proof of Theorem 2 can easily be modified to establish a corresponding result for the discrete-time case in which $\Omega$ is replaced with $\Omega_d \triangleq \{0, 1, \ldots\}$. In fact, for that case the proof simplifies in an important conceptual way because then for any positive integer $w$, $(S_1)_w$ is compact with respect to the usual discrete-time analog of $\rho_w$ in the proof of the theorem. In particular, in the discrete-time case we can set $n = p$, set $S = S_1$, and take $H$ to be the identity map from $S$ onto itself. This leads to the following theorem in which $\ell_\infty(n)$ is $L_\infty(n)$ with $\Omega$ replaced with $\Omega_d$, and $k(q)$ stands for the collection of all functions $k$ from $\Omega_d^q$ to the $1 \times n^q$ matrices such that (9) holds for all $\tau_1, \ldots, \tau_q$ and all $j$, where $R_j < \infty$ and the $\phi_{jil}(\tau_l)$ are real and are nonzero for at most a finite number of values of $\tau_l$.
Theorem 3: Let $U = \{u \in \ell_\infty(n): \|u\| \le \beta\}$ in which $\beta$ is a positive number, and let $K$ be a map from $U$ to the real-valued functions defined on $\Omega_d$ such that $K$ is causal, time invariant and an element of $\mathcal{A}(U)$ in the sense of Section 2.1 with $\Omega$ replaced with $\Omega_d$. Let $K$ satisfy the continuity condition that given $y \in U$ and numbers $t \in \{1, 2, \ldots\}$ and $\delta_1 > 0$, there is a $\delta_2 > 0$ such that $|(Ky)(t) - (Kz)(t)| < \delta_1$ whenever $z \in U$ and $\max_{\tau \in \{0, \ldots, t\}} |y(\tau) - z(\tau)| < \delta_2$. Then, given any $\epsilon > 0$, there is a positive integer $Q$, and elements $k_q$ of $k(q)$ $(1 \le q \le Q)$ such that

$$|(Ku)(t) - (Vu)(t)| < \epsilon, \qquad t \in \Omega_d$$

for all $u \in U$, where

$$(Vu)(t) = \sum_{q=1}^{Q} \sum_{\tau_1=0}^{t} \cdots \sum_{\tau_q=0}^{t} k_q(\tau_1, \ldots, \tau_q)\, x[u(t - \tau_1), \ldots, u(t - \tau_q)].$$
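To make the form of the approximant in Theorem 3 concrete, the following sketch evaluates a finite discrete-time Volterra series for a scalar input ($n = 1$, so the product vector reduces to an ordinary product). It is a direct, unoptimized evaluation under assumed example kernels; `volterra_output`, `k1`, `phi`, and `k2` are names chosen here and do not come from the text.

```python
from itertools import product


def volterra_output(kernels, u, t):
    """Evaluate (Vu)(t) = sum_q sum_{tau_1..tau_q = 0}^{t}
    k_q(tau_1, ..., tau_q) * u(t - tau_1) * ... * u(t - tau_q).

    `kernels` maps each order q to a function of q integer arguments.
    """
    total = 0.0
    for q, k in kernels.items():
        for taus in product(range(t + 1), repeat=q):
            term = k(*taus)
            if term == 0:
                continue  # kernels vanish outside a finite set
            for tau in taus:
                term *= u[t - tau]
            total += term
    return total


def k1(tau):
    """First-order kernel, nonzero for only finitely many tau."""
    return 0.5 ** tau if tau <= 3 else 0.0


def phi(tau):
    return 1.0 if tau <= 1 else 0.0


def k2(tau1, tau2):
    """Second-order kernel in sum-of-products form with one term."""
    return phi(tau1) * phi(tau2)


u = [1.0, 2.0, 3.0]
value = volterra_output({1: k1, 2: k2}, u, 2)
```

The cost of this direct evaluation grows like $(t + 1)^q$ per order $q$; when the kernels have the product form of (9), each term can instead be computed as a product of one-dimensional convolutions.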
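APPENDIX A
Proof of the Proposition


Let any $\delta > 0$, $u \in S$, and $t \ge 0$ be given. Choose real $\delta_1 > 0$ and $\delta_2 > 0$ such that

$$2\delta_1 + 2c\delta_2 < \delta,$$

and let $\Delta_1$ and $\Delta_2$ be elements of $\Omega$ for which

$$|(Fy)(\tau) - Fy_{\max[0,\,\tau-\Delta_1]}(\tau)| < \delta_1, \qquad \tau \ge 0 \tag{16}$$

for $y \in H(S)$, and

$$|(Hv)(\tau) - Hv_{\max[0,\,\tau-\Delta_2]}(\tau)| < \delta_2, \qquad \tau \ge 0 \tag{17}$$

for $v \in S$. Choose $\Delta \in \Omega$ so that $\Delta > \Delta_1 + \Delta_2$, and consider

$$\varphi \triangleq |(FHu)(t) - FHu_{\max[0,\,t-\Delta]}(t)|.$$

If $t \in [0, \Delta]$, $u = u_{\max[0,\,t-\Delta]}$ and $\varphi = 0$. Now let $t > \Delta$. Clearly,

$$\varphi = |(FHu)(t) - (FH\tilde{u})(t)|,$$

where $\tilde{u} = u_{(t-\Delta)}$. Using (16), we have

$$|(FHu)(t) - F(Hu)_{(t-\Delta_1)}(t)| < \delta_1,$$

and

$$|(FH\tilde{u})(t) - F(H\tilde{u})_{(t-\Delta_1)}(t)| < \delta_1,$$

and one finds that

$$\varphi \le |(FHu)(t) - F(Hu)_{(t-\Delta_1)}(t) + F(H\tilde{u})_{(t-\Delta_1)}(t) - (FH\tilde{u})(t) + F(Hu)_{(t-\Delta_1)}(t) - F(H\tilde{u})_{(t-\Delta_1)}(t)|$$

$$\le 2\delta_1 + c \sup_{\tau \in [(t-\Delta_1),\,t]} |(Hu)(\tau) - (H\tilde{u})(\tau)|.$$

By (17), $|(Hu)(\tau) - Hu_{(\tau-\Delta_2)}(\tau)| < \delta_2$ and $|(H\tilde{u})(\tau) - H\tilde{u}_{(\tau-\Delta_2)}(\tau)| < \delta_2$ for $\tau \ge \Delta_2$. Note that $\tau \ge (t - \Delta_1)$ and $t > \Delta$ imply $\tau > \Delta_2$; and that for $\tau \ge (t - \Delta_1)$, $Hu_{(\tau-\Delta_2)}(\tau) = H\tilde{u}_{(\tau-\Delta_2)}(\tau)$. Thus,

$$\sup_{\tau \in [(t-\Delta_1),\,t]} |(Hu)(\tau) - (H\tilde{u})(\tau)| \le 2\delta_2,$$

which shows that $\varphi \le 2\delta_1 + 2c\delta_2$. Since this implies that $\varphi < \delta$, the proposition is proved.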


APPENDIX B 
An Example of an Application of Theorem 2 


In this Appendix we consider systems governed by the model

$$y = Nx \tag{18}$$
$$x = Av + Cy \tag{19}$$
$$w = Dv + By, \tag{20}$$

in which $v$ is the input, $w$ is the output, $A$, $B$, $C$, and $D$ are linear operators, $N$ is nonlinear, and $x$ and $y$ can be viewed as the input and output, respectively, of the nonlinear portion of the system. Models of this kind have been used in Ref. 2 and in other papers. Here we suppose that $v$, $w$, $x$, and $y$ belong to $L_\infty(n)$, $L_\infty(1)$, $L_\infty(p)$, and $L_\infty(p)$, respectively, that $N$ is memoryless and defined by $(Nx)(t) = \eta[x(t)]$ where $\eta$ is a map from $\mathbb{R}^p$ to $\mathbb{R}^p$ which takes the zero element of $\mathbb{R}^p$ into itself, that $\eta$ satisfies a global Lipschitz condition $|\eta(x_a) - \eta(x_b)| \le \gamma|x_a - x_b|$ where $|\cdot|$ is as in the definition of the norm in $L_\infty(p)$, and that $A$, $B$, $C$, and $D$, respectively, are causal time-invariant bounded linear maps from $L_\infty(n)$ to $L_\infty(p)$, $L_\infty(p)$ to $L_\infty(1)$, $L_\infty(p)$ to $L_\infty(p)$, and $L_\infty(n)$ to $L_\infty(1)$. In particular, we assume that $C$ has the convolution representation

$$(Cy)(t) = \int_0^t c(t - \tau)y(\tau)\,d\tau, \qquad t \ge 0$$

for $y \in L_\infty(p)$, where $c(\cdot)$ is $p \times p$ and has integrable elements [that is, has elements that are integrable on $[0, \infty)$]. The equations of a very large class of systems with a single output can be put in this form with $A$, $B$, and $D$ convolutions whose matrix-valued kernels are either integrable, or integrable with the exception of an impulse at the origin (see Ref. 2, Appendices I and II).
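A minimal discrete-time sketch of the model (18)-(20) may help fix ideas. It assumes scalar signals ($n = p = 1$), finite-length kernels for $A$, $B$, $C$, and $D$, and a strictly causal feedback kernel (`c[0] == 0`) so that the loop can be solved by simple forward recursion; none of these restrictions is imposed in the text, and the names `conv_at` and `simulate` are hypothetical.

```python
import math


def conv_at(kernel, signal, t):
    """Causal discrete convolution sum_{k=0}^{t} kernel[k] * signal[t - k]."""
    upper = min(t, len(kernel) - 1)
    return sum(kernel[k] * signal[t - k] for k in range(upper + 1))


def simulate(v, a, b, c, d, eta):
    """Solve y = N x, x = A v + C y, w = D v + B y sample by sample.

    Assumes c[0] == 0, so x[t] depends only on earlier values of y.
    """
    if c[0] != 0:
        raise ValueError("this sketch requires a strictly causal kernel c")
    x, y, w = [], [], []
    for t in range(len(v)):
        # Feedback uses y[0..t-1] only, because c[0] is zero.
        feedback = sum(c[k] * y[t - k]
                       for k in range(1, min(t, len(c) - 1) + 1))
        x_t = conv_at(a, v, t) + feedback        # equation (19)
        x.append(x_t)
        y.append(eta(x_t))                       # equation (18)
        w.append(conv_at(d, v, t) + conv_at(b, y, t))  # equation (20)
    return x, y, w


v = [1.0, 0.5, -0.25, 0.0, 0.75]
a, b, c, d = [1.0, 0.5], [1.0, 0.5], [0.0, 0.4, 0.1], [0.2]
x, y, w = simulate(v, a, b, c, d, math.tanh)
```

Here `math.tanh` plays the role of $\eta$: it maps zero to zero and is globally Lipschitz with constant 1, as the hypotheses on $\eta$ require.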


B.1 Further assumptions, and approximations 


Assume in the remainder of this Appendix that $(I - CN)$ is an invertible map of $L_\infty(p)$ onto $L_\infty(p)$, where $I$ is the identity operator on $L_\infty(p)$, and that $(I - CN)^{-1}$ is causal, time invariant, and globally Lipschitz. Conditions under which these assumptions are met can be obtained from standard existence theory and results in the area of stability theory [see, for example, Ref. 20, Theorem 3 and Corollary 3(a)]. It follows that $w = Dv + BN(I - CN)^{-1}Av$ for all $v \in L_\infty(n)$.

With $r$ an arbitrary positive constant, let us now restrict our inputs $v$ to the ball $\Lambda = \{v \in L_\infty(n): \|v\| \le r\}$. Let $\Lambda_1 = \{u \in L_\infty(p): \|u\| \le r\|A\|\}$. In addition to the assumptions introduced above, suppose that $A$ is a convolution with an integrable kernel, and that $(I - CN)^{-1}$, which takes $\Lambda_1$ into $L_\infty(p)$, belongs to $\mathcal{A}(\Lambda_1)$ in the sense of Section 2.1. Using the proposition in Section 2.3, and by considering one component of $(I - CN)^{-1}A$ at a time, we see that $(I - CN)^{-1}A \in \mathcal{A}(\Lambda)$, since, as can easily be verified, $A \in \mathcal{A}(\Lambda)$. Similarly, $N(I - CN)^{-1}A: \Lambda \to L_\infty(p)$ and finally $BN(I - CN)^{-1}A$ both belong to $\mathcal{A}(\Lambda)$. Thus, by Theorem 2 [with $H = A$ and $F = BN(I - CN)^{-1}$], and in the sense of Theorem 2, $BN(I - CN)^{-1}A$ can be approximated arbitrarily well on $\Lambda$ by a finite Volterra series.

Before proceeding to the important matter of how one might show that $(I - CN)^{-1} \in \mathcal{A}(\Lambda_1)$ under some reasonable conditions, suppose that the assumptions described above are met, with the exception that $A$ is not a convolution. Assume instead that $(Av)(t) = av(t)$, where $a$ is a $p \times n$ matrix of constants. (This case arises naturally in the study of feedback systems.) Using the identity $(I - CN)^{-1} = (I - CN)^{-1}CN + I$, one has $w = Dv + BNAv + BN(I - CN)^{-1}CNAv$. The term $BNAv$ has a simple representation as is; and if, for example, $B$ is a convolution with an integrable kernel and $n = p = 1$, then, by the Weierstrass approximation theorem for real-valued continuous functions on a compact real interval, it is clear that it can be approximated arbitrarily well on $\Lambda$ by a finite series having the form

$$\sum_{q=1}^{Q} \int_0^t b_q(t - \tau)v(\tau)^q\,d\tau, \qquad t \ge 0. \tag{21}$$

Consider now the more interesting term $BN(I - CN)^{-1}CNAv$. By Theorem 2 (this time with $H = C$) we see that it can be approximated arbitrarily well on $\Lambda$ by a finite series of the form (11) with each $u(t - \tau_j)$ replaced with $\eta[av(t - \tau_j)]$. In particular, using the fact that the $k_q$ in (11) satisfy

$$\int_{[0,\infty)^q} |k_{qj}(\tau_1, \ldots, \tau_q)|\,d(\tau_1, \ldots, \tau_q) < \infty$$

for each $j$, and the Weierstrass approximation theorem for real-valued continuous functions of several real variables, it follows that the term can be approximated arbitrarily well throughout $\Lambda$ by a finite Volterra-like series in the sense of the sets of iterated integrals $K(m)$ in Ref. 2. [These Volterra-like series, which are frequently needed in exact expansion representations, can be viewed as Volterra series with symbolic kernels that include certain delta functions. A simple example of a Volterra-like series is (21).]


B.2 $(I - CN)^{-1}$ and the memory condition

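The hypothesis that $(I - CN)^{-1} \in \mathcal{A}(\Lambda_1)$ plays a key role in the discussion above. We begin our comments concerning this hypothesis with the observation that with arbitrary $t \ge 0$, $\Delta \ge 0$, and $u \in \Lambda_1$, one has $[(I - CN)^{-1}u](t) - [(I - CN)^{-1}u_{\max[0,\,t-\Delta]}](t)$ equal to $x(t) - \tilde{x}(t)$, where $x$ and $\tilde{x}$ are elements of $L_\infty(p)$ such that

$$x(\alpha) + \int_0^\alpha c(\alpha - \tau)\eta[x(\tau)]\,d\tau = u(\alpha) \tag{22}$$
$$\tilde{x}(\alpha) + \int_0^\alpha c(\alpha - \tau)\eta[\tilde{x}(\tau)]\,d\tau = u_{\max[0,\,t-\Delta]}(\alpha) \tag{23}$$

for $\alpha \ge 0$.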
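With $\sigma$ any positive constant,

$$y(\alpha) + \int_0^\alpha \hat{c}(\alpha - \tau)\hat{\eta}[y(\tau), \tau]\,d\tau = z(\alpha) \tag{24}$$
$$\tilde{y}(\alpha) + \int_0^\alpha \hat{c}(\alpha - \tau)\hat{\eta}[\tilde{y}(\tau), \tau]\,d\tau = \tilde{z}(\alpha) \tag{25}$$

for $\alpha \in [0, \infty)$, where $y(\alpha) = x(\alpha)e^{\sigma\alpha}$, $\tilde{y}(\alpha) = \tilde{x}(\alpha)e^{\sigma\alpha}$, $\hat{c}(\alpha) = c(\alpha)e^{\sigma\alpha}$, $\hat{\eta}[y(\tau), \tau] = e^{\sigma\tau}\eta[e^{-\sigma\tau}y(\tau)]$, $z(\alpha) = u(\alpha)e^{\sigma\alpha}$, and $\tilde{z}(\alpha) = e^{\sigma\alpha}u_{\max[0,\,t-\Delta]}(\alpha)$. Let $\Sigma$ denote the set of positive $\sigma$ such that the elements of $\hat{c}$ are square integrable, and suppose that $\Sigma$ is not empty. Let us now make the key assumption, which we shall refer to as A.0, that from (24) and (25) it can be concluded that for some $\sigma \in \Sigma$ there is a constant $\lambda$ which depends only on $c$, $\eta$, and $\sigma$ such that

$$\|y - \tilde{y}\|_2 \le \lambda\|z - \tilde{z}\|_2,$$

where

$$\|y - \tilde{y}\|_2^2 = \int_0^\infty [y(t) - \tilde{y}(t)]^{\mathrm{tr}}[y(t) - \tilde{y}(t)]\,dt,$$

"tr" denotes the transpose, and similarly for $\|z - \tilde{z}\|_2$. Much is known about conditions under which A.0 is met [see Ref. 20, Corollary 1(a) and Theorem 6], and it is known that A.0 is met in certain specific important cases.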

For $t \le \Delta$, obviously $x(t) - \tilde{x}(t) = 0$. Now let $t > \Delta$.

Notice that $\|y - \tilde{y}\|_2 \le \xi e^{\sigma(t-\Delta)}$ where $\xi = \lambda p^{1/2} r\|A\|$. Using (24) and (25), and the Schwarz inequality,

$$|y(t) - \tilde{y}(t)| \le \max_i \sum_{j=1}^{p} \left(\int_0^t |\hat{c}_{ij}(\tau)|^2\,d\tau\right)^{1/2} \left(\int_0^t |\hat{\eta}_j[\tilde{y}(\tau), \tau] - \hat{\eta}_j[y(\tau), \tau]|^2\,d\tau\right)^{1/2}.$$

Since

$$\int_0^t |\hat{\eta}_j[\tilde{y}(\tau), \tau] - \hat{\eta}_j[y(\tau), \tau]|^2\,d\tau \le \gamma^2 \int_0^t |\tilde{y}(\tau) - y(\tau)|^2\,d\tau \le \gamma^2 \int_0^t [\tilde{y}(\tau) - y(\tau)]^{\mathrm{tr}}[\tilde{y}(\tau) - y(\tau)]\,d\tau \le \gamma^2\|\tilde{y} - y\|_2^2,$$

we find that

$$|y(t) - \tilde{y}(t)| \le \xi_1\gamma\xi e^{\sigma(t-\Delta)},$$

where

$$\xi_1 = \max_i \sum_{j=1}^{p} \left(\int_0^\infty |\hat{c}_{ij}(\tau)|^2\,d\tau\right)^{1/2}.$$

Thus,

$$|x(t) - \tilde{x}(t)| \le \xi_1\gamma\xi e^{-\sigma\Delta}.$$

This shows that $(I - CN)^{-1} \in \mathcal{A}(\Lambda_1)$ under the conditions described, and therefore that the input-output maps of a very large class of systems have finite Volterra series approximations in the strong sense of this Appendix. [For example, using material in Ref. 20 (see the comment at the bottom of p. 875 there), it is not difficult to show that this class includes a large family of electrical networks consisting of sources, passive elements, and monotone nonlinear resistors.]
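The memory property established in this section, namely that the response at time $t$ depends only weakly on the part of the input that lies more than $\Delta$ in the past, can be observed numerically in a discrete-time scalar analog of (22) and (23). The sketch below is illustrative only: it assumes a strictly causal, geometrically decaying kernel, takes $\eta = \tanh$, and follows the sign convention of (22) as printed. The names `solve` and `memory_gap` are hypothetical.

```python
import math


def solve(u, c, eta):
    """Solve x[t] + sum_{k=1}^{t} c[k] * eta(x[t - k]) = u[t] causally
    (a discrete analog of (22) with c[0] treated as zero)."""
    x = []
    for t in range(len(u)):
        feedback = sum(c[k] * eta(x[t - k])
                       for k in range(1, min(t, len(c) - 1) + 1))
        x.append(u[t] - feedback)
    return x


def memory_gap(u, c, eta, delta):
    """|x(t) - x~(t)| at the final time t, where x~ is the response to the
    input with all samples before max(0, t - delta) set to zero."""
    t = len(u) - 1
    start = max(0, t - delta)
    u_trunc = [u[tau] if tau >= start else 0.0 for tau in range(len(u))]
    return abs(solve(u, c, eta)[t] - solve(u_trunc, c, eta)[t])


T = 60
u = [math.sin(0.3 * t) for t in range(T + 1)]                 # |u| <= 1
c = [0.0] + [0.2 * 0.5 ** (k - 1) for k in range(1, T + 1)]   # assumed kernel
gaps = {delta: memory_gap(u, c, math.tanh, delta)
        for delta in (5, 15, 30, 60)}
```

For this particular kernel a simple comparison argument bounds the gap by a constant times $0.7^{\Delta}$, which mirrors the factor $e^{-\sigma\Delta}$ in the estimate above; with `delta = 60` nothing is truncated and the gap is exactly zero.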